0) Download & compile Benjamin's specific library

    git clone git@auder.net:cgds
    cd cgds
    bash makeMakefile.sh src
    make src
    sudo make install

Make sure that the install destination is on the LD_LIBRARY_PATH environment variable.
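
One way to do this from a shell is sketched below. The path /usr/local/lib is only an assumption about where "make install" puts the library, and add_to_ld_library_path is a hypothetical helper name, not part of cgds.

```shell
# Append a directory to LD_LIBRARY_PATH unless it is already listed.
# /usr/local/lib is an assumed install destination; adjust it to wherever
# "sudo make install" actually put the cgds library.
add_to_ld_library_path() {
    dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir" ;;
    esac
    export LD_LIBRARY_PATH
}

add_to_ld_library_path /usr/local/lib
```

Putting the call in a shell startup file such as ~/.bashrc would make the setting persistent across sessions.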

1) Compile source code for 1st stage clustering

    mkdir -p build/stage1/src
    cd build/stage1/src
    cmake ../../../stage1/src
    make

2) Repeat the previous lines for stage 2 (unverified)


NOTE: openmpi, mpich (compiler for MPI), libxml, and libgsl need to be installed.
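
A quick way to see which build tools are missing is sketched below. The command names (mpicc, mpirun, cmake, make) are assumptions about what the MPI and build packages provide, and missing_commands is a hypothetical helper. The sketch does not check for the libxml and libgsl development files.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Assumed command names: mpicc and mpirun for MPI, cmake and make for the build.
missing=$(missing_commands mpicc mpirun cmake make)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
fi
```

If nothing is printed, all four commands were found on PATH.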


Usage (stage 1) :

Serialize input data using

    ppam.exe serialize inputfile_edf outputfile_edf 1 0

# 1 indicates data is by column
# 0 means process all the rows

Then run the clustering using

    mpirun -np nbProcess ppam.exe cluster ifilename nbSeriesInChunk nbClusters randomize p_for_dissims

## ex. > mpirun -np 4 ./ppam.exe cluster ~/tmp/2009.bin 5000 200 1 2

Where :
    nbProcess = number of simultaneous processes
    ifilename = path to serialized dataset (see the serialization step above and the note below)
    nbSeriesInChunk = number of time-series to process sequentially
    nbClusters = number of clusters
    randomize = 1 to dispatch time-series at random, 0 to process them in order
    p_for_dissims = the 'p' of L_p distance used to compute dissimilarities
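
To illustrate what p_for_dissims controls, here is a small sketch of the usual L_p distance between two series, (sum_i |x_i - y_i|^p)^(1/p). It uses awk for the arithmetic. lp_distance is a hypothetical helper, and whether ppam.exe applies the final 1/p root in exactly this form is not confirmed here (src/main.c has the details).

```shell
# L_p distance between two space-separated series of equal length.
# Usage: lp_distance p "x1 x2 ..." "y1 y2 ..."
lp_distance() {
    awk -v p="$1" -v x="$2" -v y="$3" 'BEGIN {
        n = split(x, a, " ")
        split(y, b, " ")
        sum = 0
        for (i = 1; i <= n; i++) {
            d = a[i] - b[i]
            if (d < 0) d = -d          # absolute difference
            sum += d ^ p
        }
        printf "%g\n", sum ^ (1 / p)   # p-th root of the summed powers
    }'
}

lp_distance 2 "0 0" "3 4"   # p = 2 (Euclidean distance)
lp_distance 1 "0 0" "3 4"   # p = 1 (sum of absolute differences)
```

The two calls print 5 and 7, so on the same pair of series a larger p gives more weight to the largest pointwise difference.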


The results are stored in ppamResult.xml (curve ids and ranks), while ppamFinalSeries.bin
contains the curves used in the last clustering step. The ranks in ppamResult.xml refer to the
curves in ppamFinalSeries.bin.


Note : the [de]serialization is custom. Consider writing your own
in the src/TimeSeries/ folder if you plan to test the package.

See also src/main.c for the details.