Efficient Repair of Polluted Machine Learning Systems via

  • Date:06 May 2020
  • Pages:13

Transcription:

Efficient Repair of Polluted Machine Learning Systems via Causal Unlearning

Yinzhi Cao, Lehigh University, Bethlehem, PA, yinzhi.cao@lehigh.edu
Alexander Fangxiao Yu, Columbia University, New York, NY, afy2103@columbia.edu
Andrew Aday, Columbia University, New York, NY, aza2112@columbia.edu
Eric Stahl, Lehigh University, Bethlehem, PA, ems316@lehigh.edu

ASIA CCS '18, June 4-8, 2018, Incheon, Republic of Korea. Y. Cao et al.

KARMA thus reduces the manual effort required down to two parts. First, it assumes that some users report misclassified testing samples (e.g., as in the Microsoft Tay example or spam detection), and for added security it relies on administrators to verify the user reports. KARMA does not require all misclassified samples to be collected upfront before it repairs a system; instead, our evaluation shows that it can incrementally clean a system as users gradually report misclassifications. Second, it relies on administrators to inspect the set of polluted samples it returns. KARMA determines the set of polluted samples by leveraging the causality of misclassifications, not the contents of the samples. Therefore, it may have false positives, such as flagging unpolluted outliers in the training set. However, we view it as an advantage to use KARMA to detect outliers from the training set. It may also have false negatives, such as missing polluted training samples. However, if the remaining polluted samples do not cause user-noticeable misclassifications, their harm may be small.

To ease discussion, we term the set of user-reported misclassified test samples the oracle set. Administrators can augment this oracle set with correctly classified test samples for better results. We assume that all samples in the oracle set are assigned their correct classifications. They may come from the aforementioned administrator verification or from automated approaches such as malware detection via sophisticated dynamic analysis. In either case, we can afford to verify the oracle set but not the entire training set, because the oracle set is often orders of magnitude smaller than the training set.

Although the causal unlearning idea is intuitive, KARMA faces two challenges. First, the search space for causality in the training set is very large, but at the same time KARMA needs to inspect the entire space to avoid evasion. Second, a large training set can also make it costly to compute a new model after removing a subset. To speed up the search for causality, KARMA adopts a heuristic that balances search coverage and speed, based on the observation that similar causes will lead to similar effects with a high probability. Specifically, if two training samples serving as causes in KARMA are very similar and share the same label, their influences on the learning system are also similar, i.e., there is a higher chance that they are both polluted or both unpolluted. To speed up model computation, KARMA leverages machine unlearning [14], but also works with incremental or decremental machine learning [15, 20, 37, 42, 43].

We evaluated KARMA on three systems covering two popular learning algorithms (Bayes and SVM) and two application domains (spam and malware detection). Our results show:

  • KARMA reduces manual efforts. Specifically, in an attack scenario from Nelson et al. [33], i.e., 1% of samples are polluted, KARMA reduces the manual effort from the entire training set to 3% of the training set, i.e., 2% as an oracle set and 1% as the identified polluted samples, an over 30-fold reduction.
  • KARMA is robust to a variety of attacks with different parameters. Specifically, KARMA repairs learning models affected by a wide variety of 95 data pollution attacks, ranging from mislabelling to injection attacks, with different tactics such as targeted and blind, and with different pollution rates from 0.5% to 30%.
  • KARMA is accurate. Specifically, KARMA identifies 99.2% of polluted samples in median, with the minimum as 98.0% and the maximum 99.97%.
  • KARMA is effective. Specifically, KARMA restores the accuracy of polluted learning models against a third dataset to that of the vanilla one within 1% differences.

This paper makes three main contributions. At a conceptual level, causal unlearning is the first approach to efficient repair of learning systems and may inspire many possible systems toward this direction. At a system level, we have built KARMA, a causal unlearning system that uses several mechanisms to efficiently determine the set of polluted data samples with high precision and recall. KARMA is open source and available at the following repository: https://github.com/CausalUnlearning/KARMA. At an evaluation level, we show that our approach works with real-world machine learning systems and greatly reduces the manual effort required to repair a polluted system.

Our work is only the first step toward practical repair of learning systems; more challenges lie ahead. How can we perform causal unlearning on other machine learning algorithms and systems? How can we repair a system that experienced other types of attacks targeting machine learning? While removing training samples is one way to repair or improve a learning system, adding samples is another, which KARMA does not support. We hope other researchers will join us in addressing these challenges.

2 THREAT MODEL

The threat model of KARMA assumes one learning system and three parties, i.e., an administrator of the system, users of the system, and an attacker. The administrator is absolutely trustworthy, being responsible for training and maintaining the learning system; the attacker is malicious and tries to subvert the system by polluting the training dataset; most users are trustworthy, but some of them may have malicious intent. Note that we do not restrict the capability of the attacker, i.e., theoretically the attacker can pollute an arbitrary number of training data. In practice, as long as the pollution is effective, the attacker also wants to minimize the number of polluted training data and reduce her chance of being caught. Depending on how the administrator collects the training dataset, we list two attack scenarios where an attacker can pollute the training set.

Scenario One: Mislabelling Attack. In this scenario, an administrator of a learning system adopts crowdsourcing, such as asking workers on Amazon Mechanical Turk to label training samples. Some of the crowdsourcing workers have malicious intent, i.e., they will mislabel (footnote 2) samples provided by the administrator to pollute the learning model. In this scenario, the capability of attackers is limited to polluting the labels but not the contents of training samples, because all the samples are provided by the administrator.

Scenario Two: Injection Attack. In this scenario, an administrator of a learning system tries to collect malicious samples, such as spam and malware, through a honeypot-based technique. An attacker figures out the purpose of the honeypot and then intentionally sends crafted polluted samples to the honeypot so that such samples will be included in the training set. Note that attackers, different from the first scenario, are able to craft and inject contents. However, the attacker can only control one class of samples, i.e., malicious ones, because a honeypot usually collects just malicious samples.

Footnote 2: In this paper, misclassified samples (emails) refer to those that are incorrectly classified by the learning model; mislabeled samples (emails) refer to those in the training set that are incorrectly labeled by the attacker.

3 DEPLOYMENT MODEL

When we deploy KARMA with a learning system, there are three stages in the lifecycle of deployment: training, use, and repair. In the training stage, the administrator will train a learning model based on potentially polluted training data. Then, in the use stage, the administrator will obtain feedback on the learning system from third parties, such as users of the model and other independent testing parties, including VirusTotal for malware detection. After that, in the repair stage, based on the feedback, especially misclassification reports, the administrator will repair the model with the help of an oracle, such as a human performing code reviews and a dynamic analyzer exploring and examining program behaviors.

[Figure 1: Deployment Model (spam detectors as an example). Recoverable labels: parties Attacker, Admin, Users; stages Train, Use, Repair; steps 1 Train, 3 Verify Misclassification, 4 Causality Analysis, 6 Unlearn.]

An Example Deployment with Spam Detectors. Figure 1 shows the deployment model of KARMA by using an example of spam detectors, where the oracle is a trusted human. Say a spam detector is trained by an administrator with a potentially polluted training set (step one) and deployed together with an email client. When training the system, the administrator might have already deployed existing approaches, which are orthogonal to KARMA, to filter potentially polluted emails [17, 34, 39] and make the model robust. However, some polluted emails may have bypassed the filter and still make the learning model misclassify samples, as evidenced by existing attacks [35, 45].

Then, the users of this email client complain about misclassifications and report misclassified emails to the administrator (step two). The administrator or other trusted person, i.e., an oracle, verifies these reported misclassifications and uses them as an input dataset for KARMA, called the oracle set (step three). To improve accuracy, the oracle set can include a small number of correctly classified emails as well.

Next, the administrator deploys KARMA to find the cause of misclassifications in the oracle set (step four). The cause, a subset of the training set, will be verified by the administrator to confirm which samples are polluted and whether the misclassification is caused by data pollution (step five). After that, the administrator can ask our system to repair the learning model by removing verified polluted samples (step six).

Note that KARMA greatly relieves the burden of the oracle. Without KARMA, an administrator needs to first verify misclassifications reported by users and confirm that the model misbehaves. Then, the oracle needs to confirm misclassifications, go over all training samples, and find pollutions. Now, with KARMA, the oracle still verifies misclassifications reported by users, but then only needs to verify the dataset reported by the users and the misclassification cause, both smaller than the entire training dataset. As shown in the evaluation (Section 6), the size of the oracle set is less than 2% of training data. The size of the misclassification cause highly depends on the attacker's strategies, and varies from 0.5% to 30% of training data in our experiment. According to Nelson et al. [33], only 1% of samples are needed to subvert an email filter. It is our future work to further decrease the size of samples to be inspected by the administrator.

4 DESIGN

We present the design of KARMA in this section.

4.1 Overview

Let us first discuss the inputs and outputs of KARMA. Specifically, KARMA takes three inputs: one machine learning model M (usually a replicate of the deployed learning model for analysis purposes) and two datasets. The first set, S_training, the one used to generate M, is large and potentially polluted by an attacker; the second set, S_oracle, is a small dataset mostly coming from misclassifications reported by users of M (Step 2 of the deployment model in Figure 1). S_oracle is verified by an oracle such as the administrator of M.

The output of KARMA is another dataset, S_cause, that leads to the misclassifications of S_oracle when classified by M. In KARMA, the degree of misclassifications can be represented as the detection accuracy of M against S_oracle, defined in Equation 1.

\[
\mathrm{Accuracy}_{S_{oracle}} = 1 - \frac{\left|\{\, x \mid x \text{ is misclassified by } M,\ x \in S_{oracle} \,\}\right|}{\left|S_{oracle}\right|} \tag{1}
\]

Now let us discuss how the administrator validates S_oracle, which comes from what the users report. Specifically, the administrator's job can be summarized as follows. Note that the amount of work for the administrator is minimized because everything is performed on the small S_oracle, not the entire S_training.

  • Adding user-reported misclassified samples to S_oracle iteratively. Once the administrator collects some misclassified samples as S_oracle, she can run KARMA on it to partially repair M by finding the misclassification cause and removing a subset of polluted training samples. Because M is still polluted and produces incorrect results, i.e., misclassifying samples, users of M will report further misclassifications to the administrator. Then, the administrator can construct a new S_oracle and ask KARMA to further repair M. We have a detailed evaluation of this scenario in Section 6.5.
  • Removing falsely reported samples from users with malicious intent. Once the administrator finds that some reported samples are correctly classified, she can remove such samples, as shown in Step 2 of the deployment model in Figure 1. None of the false reports will be fed into, and thus influence, KARMA. Further, the administrator may even block such users from reporting more samples.
  • Understanding the output of KARMA and adding more samples to the training set if necessary to improve M. KARMA will find the cause of misclassification, which could be some correctly la [...]

[...] rest of the subsection, for convenience, misclassified data is referred to only one group of misclassified data. After grouping, we start clustering in each group individually, and the clustering algorithm is very similar to k-means but using our divergence scores.

Here is how the first phase, clustering misclassified data, works. KARMA randomly selects k samples from misclassified data, c_1, c_2 [...]
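The detection accuracy of M against the oracle set (Equation 1) is one minus the fraction of oracle samples that M misclassifies. The following is a minimal Python sketch of that computation, not code from the KARMA repository: the function name `oracle_accuracy`, the `predict` callable standing in for M, and the toy keyword-based spam model are all assumptions made for illustration.

```python
from typing import Callable, Hashable, Sequence, Tuple, TypeVar

X = TypeVar("X")


def oracle_accuracy(
    predict: Callable[[X], Hashable],
    oracle_set: Sequence[Tuple[X, Hashable]],
) -> float:
    """Equation 1: 1 - |misclassified samples in the oracle set| / |oracle set|.

    `predict` is a hypothetical stand-in for the model M; `oracle_set` holds
    (sample, verified_label) pairs, i.e. labels already checked by the oracle.
    """
    if not oracle_set:
        raise ValueError("oracle set must not be empty")
    misclassified = sum(1 for x, label in oracle_set if predict(x) != label)
    return 1.0 - misclassified / len(oracle_set)


# Toy stand-in for M (an assumption): flags an email as spam
# whenever it mentions the word "prize".
def toy_model(email: str) -> str:
    return "spam" if "prize" in email.lower() else "ham"


# Toy oracle set with verified labels; two of the four samples are
# misclassified by toy_model.
toy_oracle = [
    ("Claim your PRIZE now", "spam"),
    ("Meeting at 10am", "ham"),
    ("Cheap meds, no prescription", "spam"),      # toy_model says "ham"
    ("You won a prize? Lunch is on me", "ham"),   # toy_model says "spam"
]

accuracy = oracle_accuracy(toy_model, toy_oracle)
print(f"oracle-set accuracy: {accuracy:.2f}")
```

In an iterative repair, this value could be recomputed after each round of removing suspected polluted training samples and retraining, with a higher value suggesting that the removed samples contributed to the reported misclassifications.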
