From: Jairo Cugliari
Date: Tue, 13 Dec 2016 11:50:20 +0000 (+0100)
Subject: Reading through the code
X-Git-Url: https://git.auder.net/?p=epclust.git;a=commitdiff_plain;h=b81706231c887d593a0eed89940427b577c57ff0

Reading through the code
---

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..783cd5b
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,2 @@
+*.swp
+
diff --git a/code/README b/code/README
index 67ce9a7..6a539af 100644
--- a/code/README
+++ b/code/README
@@ -1,15 +1,39 @@
-To compile source code :
+0) Download & compile Benjamin's specific library
+
+    git clone git@auder.net:cgds
+    cd cgds
+    bash makeMakefile.sh src
+    make src
+    sudo make install
+
+Make sure that the install destination is on the LD_LIBRARY_PATH environment variable.
+
+1) Compile source code for 1st stage clustering
 
     mkdir -p build/stage1/src
     cd build/stage1/src
     cmake ../../../stage1/src
     make
-    #repeat previous lines for stage 2
+
+2) #repeat previous lines for stage 2 ???
+
+
+NOTA: Need to have openmpi, mpich (compiler for mpi) libxml, and libgsl installed.
+
 
 Usage (stage 1) :
 
+Serialize input data using
+
+    ppam.exe serialize inputfile_edf outputfile_edf 1 0
+
+# 1 indicates data is by column
+# 0 means process all the rows
+
     mpirun -np nbProcess ppam.exe cluster ifilename nbSeriesInChunk nbClusters randomize p_for_dissims
 
+## ex. > mpirun -np 4 ./ppam.exe cluster ~/tmp/2009.bin 5000 200 1 2
+
 Where :
     nbProcess = number of simultaneous processes
     ifilename = path to serialized dataset (read below)
@@ -18,6 +42,12 @@ Where :
     randomize = 1 to dispatch time-series at random. 0 to process them in order
     p_for_dissims = the 'p' of L_p distance used to compute dissimilarities
 
+
+The results are stored in ppamResult.xml (curves ids and ranks) while ppamFinalSeries.bin
+are the curves used in the last clustering step. The ranks in ppamResult.xml refer to the
+curves in ppamFinalSeries.bin
+
+
 Note : custom [de]serialization.
 Consider writing your own in src/TimeSeries/ folder if you plan to test the package.
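
Step 0 of the patched README asks that the cgds install destination be on LD_LIBRARY_PATH, but does not say how to check this. A minimal sketch follows. It assumes the library was installed under /usr/local/lib, which the README does not state, and CGDS_LIBDIR is a hypothetical variable name introduced only for this illustration.

```shell
#!/bin/sh
# Assumption: `sudo make install` put the cgds library in /usr/local/lib.
# Set CGDS_LIBDIR beforehand if your install destination differs.
CGDS_LIBDIR="${CGDS_LIBDIR:-/usr/local/lib}"

# Wrap both sides in colons so the pattern only matches a whole path entry.
case ":${LD_LIBRARY_PATH:-}:" in
    *":${CGDS_LIBDIR}:"*)
        echo "LD_LIBRARY_PATH already contains ${CGDS_LIBDIR}"
        ;;
    *)
        # Prepend the directory; add the ':' separator only when the
        # variable was already non-empty.
        LD_LIBRARY_PATH="${CGDS_LIBDIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
        export LD_LIBRARY_PATH
        echo "added ${CGDS_LIBDIR} to LD_LIBRARY_PATH"
        ;;
esac
```

The change only affects the shell that runs these lines, so they would need to be sourced (for example from a shell startup file) rather than executed as a separate script before calling mpirun.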