SystemML Declarative Machine Learning on MapReduce

Systemml Declarative Machine Learning On Mapreduce-Free PDF

  • Date:03 Apr 2020
  • Views:31
  • Downloads:0
  • Pages:12
  • Size:1.16 MB

Share Pdf : Systemml Declarative Machine Learning On Mapreduce

Download and Preview : Systemml Declarative Machine Learning On Mapreduce


Report CopyRight/DMCA Form For : Systemml Declarative Machine Learning On Mapreduce


Transcription:

C A x B each input,A0 0 A0 1 A1 0 A1 1 B0 0 B1 0 Map item is sent. C0 0 A0 0 A0 1 B0 0 to 1 reducer,C1 0 A1 0 A1 1 B1 0 Shuffle. each input item,A0 0 A0 1 A1 0 A1 1 B0 0 B1 0 Map can be sent to. multiple reducers,1 0 for each k,Reduce compute,Shuffle Product Product Pki j Ai k Bk j. 0 0 1 0 P00 0 A0 0 x B0 0 P10 0 A0 1 x B1 0,P01 0 A1 0 x B0 0 P11 0 A1 1 x B1 0.
A0 0 x B0 0 A1 0 x B0 0 for each i j,Reduce compute. A0 1 x B1 0 A1 1 x B1 0 each input, Ci j k Ai k Bk j P00 0 P01 0 P10 0 P11 0 Map item is sent. C0 0 C1 0 to 1 reducer,0 0 1 0 Shuffle,Fig 1 RMM Replication based Matrix Multiplication. P10 0 P11 0 for each i j,Reduce aggregate, Consider the expression W HH in Step 8 of Algorithm 1 C0 0 C1 0. Ci j kPki j, This expression can be evaluated in one of two orders.
od1 W H H T and od2 W HH T At first glance picking Fig 2 CPMM Cross Product based Matrix Multiplication. the right order and performing this computation may seem. straightforward but the fact that matrix multiplication itself Problem Statement Build a scalable declarative machine. can be accomplished in multiple ways complicates matters learning system that. Figure 1 and Figure 2 show two alternative MapReduce. exposes a declarative higher level language for writing. plans for matrix multiplication details of the two plans will. ML algorithms thereby freeing the user from low level. be discussed in Section IV The RMM plan in Figure 1 im. implementation details and performance tuning tasks. plements a replication based strategy in a single MapReduce. provides performance that scales to very large datasets. job while the CPMM plan in Figure 2 implements a cross. and is comparable to hand tuned implementations of. product strategy that requires 2 MapReduce jobs The choice. individual algorithms, of RMM vs CPMM is dictated by the characteristics of the. covers a large class of ML and statistical algorithms. matrices involved in the multiplication To compute W HH T. whose computational cores are linear algebra primitives. we have to choose from a total of 8 plans first choose the. and iterative numerical optimization procedures These. order of evaluation od1 or od2 and for the chosen order. include but are not restricted to linear statistical models. choose from RMM or CPMM for each matrix multiplication. PCA PageRank Matrix Factorizations and so on, Instantiating the dimensionalities of the matrices reveals the. need to choose one plan over another In the context of topic The remainder of the paper is organized as follows In. modeling the number of topics t is much smaller than the Section II we present SystemML in which ML algorithms. number of documents d and the number of words w As a are expressed in a higher level language subsequently com. result od1 will never be selected as the evaluation order since piled and automatically parallelized to execute in Hadoop an. W H produces a d w large intermediate matrix whereas open source implementation of MapReduce We then describe. HH T in od2 results in a t t small matrix When d 107 the individual components of SystemML in Section III We. w 105 and t 10 H is of medium size and the result of discuss the role of cost based optimization by showing two. HH T is tiny The replication based approach RMM performs alternative execution plans for the expensive matrix multiplica. very well for both matrix multiplications The best plan with tion operation We then present extensive experimental results. od2 is to use RMM for HH T followed by another RMM Section V to demonstrate the scalability of SystemML and. for the pre multiplication with W Empirically this plan is the effectiveness of the optimizations performed at various. 1 5 times faster than the second best plan of using CPMM stages. followed by RMM However when w is changed to 5 107. II S YSTEM ML OVERVIEW, size of H increases 500 times The overhead of replicating H. and H T makes RMM inferior to CPMM for the computation We now give an overview of SystemML Figure 3 a shows. of HH T On the other hand the result of HH T remains the overall architecture of SystemML that consists of four. to be a tiny matrix so the best plan to compute the pre components. multiplication with W is still RMM A cost model and a Language Algorithms in SystemML are written in a high. detailed discussion on choosing between CPMM and RMM level language called Declarative Machine learning Language. will be provided in Section IV DML DML exposes mathematical and linear algebra prim. As shown above the choice of a good execution strategy itives on matrices that are natural to express a large class. depends significantly on data characteristics Pushing this of ML algorithms including linear models PCA PageRank. burden on programmers will have serious implications in terms NMF etc In addition DML supports control constructs such. of scaling both development and execution time This paper as while and for to write complex iterative algorithms Through. takes a step towards addressing this problem program analysis SystemML breaks a DML script into smaller. E XAMPLE OPERATORS IN DML xij yij AND zij ARE CELLS IN MATRICES X Y AND Z RESPECTIVELY. Algorithm 1 DML Statement Semantics HOP Notation LOP Notation. Z X Y Z X Y cell wise multiplication zij xij yij b X Y group binary. Z X Y Z X Y cell wise division zij xij P,yij b P X Y group binary P. Z XY Z X Y matrix multiplication zij k xik ykj ab X Y mmrj or mmcj group aggregate. Z XT Z t X transpose zij xji r T X transform t,Z log X cell wise logarithm zij.
P log xij u log,P X unary log P, Z rowSum X row wise sums zi j xij au row X transform row group aggregate. units called statement blocks Each statement block separately generated optimization DML does not provide all the. is optimized and executed by subsequent components flexibility available in R However this loss in flexibility. High Level Operator Component HOP The HOP com results largely in loss in programming convenience and does. ponent analyzes all the operations within a statement block not significantly impact the class of ML algorithms that are. and chooses from multiple high level execution plans A expressible in DML The GNMF algorithm Algorithm 1. plan is represented in a HOP Dag a directed acyclic graph is expressed in DML syntax in Script 1 We explain DML. of basic operations called hops over matrices and scalars constructs using this example. Optimizations considered in this component include algebraic. rewrites selection of the physical representation for interme Script 1 GNMF. 1 V readMM in V rows 1e8 cols 1e5 nnzs 1e10, diate matrices and cost based optimizations 2 W readMM in W rows 1e8 cols 10. Low Level Operator Component LOP The LOP compo 3 H readMM in H rows 10 cols 1e5. nent translates the high level execution plans provided by the 4 max iteration 20. HOP component into low level physical plans on MapReduce 6 while i max iteration. represented as LOP Dags Each low level operator lop in a 7 H H t W V t W W H. LOP Dag operates on key value pairs or scalars The LOP Dag 8 W W V t H W H t H. is then compiled into one or more MapReduce jobs by packing 10 writeMM W out W. multiple lops into MapReduce jobs to keep the number of data 11 writeMM H out H. scans small We refer to this strategy as piggybacking. Runtime The runtime component executes the low level Data Types DML supports two main data types matrices. plans obtained from the LOP component on Hadoop The and scalars 5 Scalar data types supported are integer double. main execution engine in SystemML is a generic MapReduce string and logical The cells in a matrix may consist of integer. job which can be instructed to execute multiple lops inside double string or logical values. it A control module orchestrates the execution of different Statements A DML program consists of a sequence of. instances of the generic MapReduce job Multiple optimiza statements with the default computation semantics being. tions are performed in the runtime component e g execution sequential evaluation of the individual statements. plans for individual lops are decided dynamically based on The following constructs are currently supported in DML. data characteristics such as sparsity of the input matrices Input Output ReadMM and WriteMM statements are pro. Figure 3 b shows how a single DML statement vided for respectively reading and writing matrices from and. A B C D is processed in SystemML The language ex to files Optionally in the ReadMM statement the user can. pression consists of untyped variables and is translated into a provide additional properties of the matrix such as sparsity. HOP Dag consisting of a cell wise division hop and a cell number of non zero entries or nnzs. wise multiplication hop on matrices A lower level execution Control Structures Control structures supported in DML. plan is then generated for this expression as shown in the LOP include the while statement for statement and if statement. Dag Here the Cell Wise Binary Divide hop is translated into Steps 6 9 in Script 1 show an example while statement. two lops a Group lop that sorts key value pairs to align Assignment An assignment statement consists of an expres. the cells from C and D followed by the lop Binary Divide sion and the result of which is assigned to a variable e g. on Each Group Finally the entire LOP Dag is translated into Steps 7 8 and 9 in Script 1 Note that the assignment can be. a single MapReduce job where a the mapper reads three to a scalar or a matrix. inputs b all groupings are performed implicitly between Table I lists several example operators allowed in expres. the mapper and the reducer and c the reducer performs the sions in DML The arithmetic operators extend. division followed by the multiplication naturally to matrices where the semantics is such that the. operator is applied to the corresponding cells For instance. III S YSTEM ML C OMPONENTS the expression Z X Y will multiply the values in the. A Declarative Machine learning Language DML corresponding cells in X and Y and populate the appropriate. DML is a declarative language whose syntax closely cell in Z with the result Several internal functions specific to. resembles the syntax of R4 16 To enable more system particular data types are supported e g rowSum computes. 4R is prototypical for a larger class of such languages including Matlab 15 5 We treat vectors as a special case of matrices. Statement Block SB1 Live Variables In None,1 V readMM in V rows 1e8 cols 1e5 nnzs 1e10. 2 W readMM in W rows 1e8 cols 10,3 H readMM in H rows 10 cols 1e5. 4 max iteration 20,Live Variables Out,W refers to output of.
A B C D Matrix W H V, Scalar i max iteration Live Variables In Step 2 or Step 8 of. matrix i j bij cij dij Matrix W H V previous iteration and is. DML Statement Block SB2 Scalar i max iteration a 108 x 10 matrix. Binary Divide,Script Cell Wise, Binary Multiply On Each Group 6 while i max iteration. i j bij cij dij 7 H H t W V t W W H,matrix matrix 8 W W V t H W H t H. Language Group,Reduce 9 i i 1,B i j cij dij,HOP Component Cell Wise. Binary Divide i j bij Binary Divide Live Variables Out. LOP Component MAP,On Each Group Matrix W H V,Runtime matrix matrix Scalar i max iteration.
i j cij dij Live Variables In,Hadoop C D MR Job Matrix W H. Group Statement Block SB3,10 writeMM W result W, i j cij i j dij 11 writeMM H result H H refers to output of Step 7. Language HOP Component LOP Component Runtime and is a 10 x 105 matrix. Live Variables Out None, Fig 3 a SystemML Architecture b Evaluation of A B C D conceptually each key value pair contains the index and the value of a cell in the matrix. c Program Analysis, the sum of every row in a matrix and returns a column matrix current statement block Live Variables Out The results of. i e a vector while t computes the transpose of a matrix live variable analysis are shown in Figure 3 c. DML also allows users to define their own functions using. B High Level Operator Component HOP, the syntax function arglist body Here the arglist consists.
of a set of formal input and output arguments and the body is The HOP component takes the parsed representation of a. a group of valid DML statements statement block as input and produces a HOP Dag represent. Comparison with R programming language As pointed out ing the data flow. before we have made some choices in the design of DML to Description of hops Each hop in the HOP Dag has one. better enable system optimizations For example DML does or more input s performs an operation or transformation. not support object oriented features advanced data types such and produces output that is consumed by one or more sub. as lists and arrays and advanced function support such as sequent hops Table II lists some example hops supported. accessing variables in the caller function and further up in the in SystemML along with their semantics6 In addition the. call stack Besides these advanced features for programming instantiation of hops from the DML parsed representation. convenience R also supports extensive graphical procedures is exemplified in Table I Consider the matrix multiplication. that are clearly beyond the scope of DML Z X Y as an instance an AggregateBinary hop is instan. Program Analysis We now describe the sequence of steps a tiated with the binary operation and the aggregate operation. DML script goes through to generate a parsed representation The sema. plementations of machine learning ML algorithms on very large datasets ranging from 100s of GBs to TBs of data 1 This requirement is driven by applications such as social media analytics web search computational advertising and recommender systems Previous attempts at building scalable machine learning algorithms have largely been hand

Related Books