{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "<h2>Introduction</h2>\n",
    "\n",
    "I ran a few trials in different configurations for the \"Neighbors\" method\n",
    "(the only one we have discussed).<br>The best settings seem to be\n",
    "\n",
    " * simtype=\"exo\" or \"mix\": exogenous similarities with/without endogenous ones (window optimized by cross-validation)\n",
    " * same_season=FALSE: the cross-validation indices do not take seasons into account\n",
    " * mix_strategy=\"mult\": the weights are multiplied (instead of switching some of them off)\n",
    "\n",
    "I systematically compared against a naive approach: the average of the next days (\"tomorrows\") of the\n",
    "\"similar\" days over the whole past; each time without predicting the jump (except for Neighbors:\n",
    "prediction based on the computed weights).\n",
    "\n",
    "I then display the errors, a few predicted/measured curves, a few filaments, and\n",
    "histograms of some of the weights. In the filament plots, the left half of the plot\n",
    "shows the days similar to the current day, while the right half shows their\n",
    "tomorrows: these are the neighborhoods as used by the algorithm.\n",
    "\n"
   ]
  },
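  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the weighting described above concrete, the next cell gives a minimal sketch on toy data.\n",
    "It is not the talweg implementation: the Gaussian kernel, the toy matrices and every name used\n",
    "(<code>kernelWeights</code>, <code>series</code>, <code>exo</code>, the windows <code>h</code>) are assumptions chosen for illustration.\n",
    "It multiplies endogenous and exogenous weights, in the spirit of the \"mult\" strategy, and uses them to\n",
    "average the tomorrows of past days, next to the naive unweighted average.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch only, NOT the talweg implementation; all names are hypothetical\n",
    "set.seed(1)\n",
    "n_days = 30; n_hours = 24\n",
    "series = matrix(rnorm(n_days*n_hours, mean=20, sd=5), nrow=n_days) # toy daily PM10 curves\n",
    "exo = matrix(rnorm(n_days*3), nrow=n_days) # toy exogenous variables, one row per day\n",
    "\n",
    "# Gaussian-kernel weights of the rows of X relative to ref, for window h (assumed kernel)\n",
    "kernelWeights = function(X, ref, h) {\n",
    "\td2 = rowSums(sweep(X, 2, ref)^2) / ncol(X) # mean squared distance to ref\n",
    "\texp(-d2 / h^2)\n",
    "}\n",
    "\n",
    "today = n_days\n",
    "cand = 1:(today-1) # past days whose tomorrow is already observed\n",
    "w_endo = kernelWeights(series[cand,], series[today,], h=5)\n",
    "w_exo = kernelWeights(exo[cand,], exo[today,], h=1)\n",
    "w_mult = w_endo * w_exo # multiply the two sets of weights\n",
    "w_mult = w_mult / sum(w_mult) # normalize so that the weights sum to 1\n",
    "\n",
    "forecast_nn = colSums(w_mult * series[cand+1,]) # weighted average of the tomorrows\n",
    "forecast_avg = colMeans(series[cand+1,]) # naive unweighted average of the tomorrows"
   ]
  },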
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "library(talweg)\n",
    "\n",
    "ts_data = read.csv(system.file(\"extdata\",\"pm10_mesures_H_loc_report.csv\",package=\"talweg\"))\n",
    "exo_data = read.csv(system.file(\"extdata\",\"meteo_extra_noNAs.csv\",package=\"talweg\"))\n",
    "data = getData(ts_data, exo_data, input_tz = \"Europe/Paris\", working_tz=\"Europe/Paris\",\n",
    "\tpredict_at=7) #predict from P+1 to P+H included\n",
    "\n",
    "# One week per scenario: heating (ch), spreading (ep), unpolluted (np)\n",
    "indices_ch = seq(as.Date(\"2015-01-18\"),as.Date(\"2015-01-24\"),\"days\")\n",
    "indices_ep = seq(as.Date(\"2015-03-15\"),as.Date(\"2015-03-21\"),\"days\")\n",
    "indices_np = seq(as.Date(\"2015-04-26\"),as.Date(\"2015-05-02\"),\"days\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "<h2 style=\"color:blue;font-size:2em\">Heating pollution</h2>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "p_nn_exo = computeForecast(data, indices_ch, \"Neighbors\", \"Neighbors\",\n",
    "\thorizon=3, simtype=\"exo\")\n",
    "p_nn_mix = computeForecast(data, indices_ch, \"Neighbors\", \"Neighbors\",\n",
    "\thorizon=3, simtype=\"mix\")\n",
    "p_az = computeForecast(data, indices_ch, \"Average\", \"Zero\",\n",
    "\thorizon=3)\n",
    "p_pz = computeForecast(data, indices_ch, \"Persistence\", \"Zero\",\n",
    "\thorizon=3, same_day=TRUE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "e_nn_exo = computeError(data, p_nn_exo, 3)\n",
    "e_nn_mix = computeError(data, p_nn_mix, 3)\n",
    "e_az = computeError(data, p_az, 3)\n",
    "e_pz = computeError(data, p_pz, 3)\n",
    "options(repr.plot.width=9, repr.plot.height=7)\n",
    "plotError(list(e_nn_mix, e_pz, e_az, e_nn_exo), cols=c(1,2,colors()[258], 4))\n",
    "\n",
    "# Black: neighbors_mix, blue: neighbors_exo, green: average, red: persistence\n",
    "\n",
    "i_np = which.min(e_nn_exo$abs$indices)\n",
    "i_p = which.max(e_nn_exo$abs$indices)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "options(repr.plot.width=9, repr.plot.height=4)\n",
    "par(mfrow=c(1,2))\n",
    "\n",
    "plotPredReal(data, p_nn_exo, i_np); title(paste(\"PredReal nn exo day\",i_np))\n",
    "plotPredReal(data, p_nn_exo, i_p); title(paste(\"PredReal nn exo day\",i_p))\n",
    "\n",
    "plotPredReal(data, p_nn_mix, i_np); title(paste(\"PredReal nn mix day\",i_np))\n",
    "plotPredReal(data, p_nn_mix, i_p); title(paste(\"PredReal nn mix day\",i_p))\n",
    "\n",
    "plotPredReal(data, p_az, i_np); title(paste(\"PredReal az day\",i_np))\n",
    "plotPredReal(data, p_az, i_p); title(paste(\"PredReal az day\",i_p))\n",
    "\n",
    "# Blue: predicted, black: measured"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "f_np_exo = computeFilaments(data, p_nn_exo, i_np, plot=TRUE); title(paste(\"Filaments nn exo day\",i_np))\n",
    "f_p_exo = computeFilaments(data, p_nn_exo, i_p, plot=TRUE); title(paste(\"Filaments nn exo day\",i_p))\n",
    "\n",
    "f_np_mix = computeFilaments(data, p_nn_mix, i_np, plot=TRUE); title(paste(\"Filaments nn mix day\",i_np))\n",
    "f_p_mix = computeFilaments(data, p_nn_mix, i_p, plot=TRUE); title(paste(\"Filaments nn mix day\",i_p))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotFilamentsBox(data, f_np_exo); title(paste(\"FilBox nn exo day\",i_np))\n",
    "plotFilamentsBox(data, f_p_exo); title(paste(\"FilBox nn exo day\",i_p))\n",
    "\n",
    "plotFilamentsBox(data, f_np_mix); title(paste(\"FilBox nn mix day\",i_np))\n",
    "plotFilamentsBox(data, f_p_mix); title(paste(\"FilBox nn mix day\",i_p))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotRelVar(data, f_np_exo); title(paste(\"StdDev nn exo day\",i_np))\n",
    "plotRelVar(data, f_p_exo); title(paste(\"StdDev nn exo day\",i_p))\n",
    "\n",
    "plotRelVar(data, f_np_mix); title(paste(\"StdDev nn mix day\",i_np))\n",
    "plotRelVar(data, f_p_mix); title(paste(\"StdDev nn mix day\",i_p))\n",
    "\n",
    "# Global variability in red; over the 60 neighbors (+ tomorrows) in black"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotSimils(p_nn_exo, i_np); title(paste(\"Weights nn exo day\",i_np))\n",
    "plotSimils(p_nn_exo, i_p); title(paste(\"Weights nn exo day\",i_p))\n",
    "\n",
    "plotSimils(p_nn_mix, i_np); title(paste(\"Weights nn mix day\",i_np))\n",
    "plotSimils(p_nn_mix, i_p); title(paste(\"Weights nn mix day\",i_p))\n",
    "\n",
    "# Less polluted on the left, more polluted on the right"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Windows selected in (0,10] / endo on the left, exo on the right\n",
    "p_nn_exo$getParams(i_np)$window\n",
    "p_nn_exo$getParams(i_p)$window\n",
    "\n",
    "p_nn_mix$getParams(i_np)$window\n",
    "p_nn_mix$getParams(i_p)$window"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "<h2 style=\"color:blue;font-size:2em\">Pollution from agricultural spreading</h2>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "p_nn_exo = computeForecast(data, indices_ep, \"Neighbors\", \"Neighbors\",\n",
    "\thorizon=3, simtype=\"exo\")\n",
    "p_nn_mix = computeForecast(data, indices_ep, \"Neighbors\", \"Neighbors\",\n",
    "\thorizon=3, simtype=\"mix\")\n",
    "p_az = computeForecast(data, indices_ep, \"Average\", \"Zero\",\n",
    "\thorizon=3)\n",
    "p_pz = computeForecast(data, indices_ep, \"Persistence\", \"Zero\",\n",
    "\thorizon=3, same_day=TRUE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "e_nn_exo = computeError(data, p_nn_exo, 3)\n",
    "e_nn_mix = computeError(data, p_nn_mix, 3)\n",
    "e_az = computeError(data, p_az, 3)\n",
    "e_pz = computeError(data, p_pz, 3)\n",
    "options(repr.plot.width=9, repr.plot.height=7)\n",
    "plotError(list(e_nn_mix, e_pz, e_az, e_nn_exo), cols=c(1,2,colors()[258], 4))\n",
    "\n",
    "# Black: neighbors_mix, blue: neighbors_exo, green: average, red: persistence\n",
    "\n",
    "i_np = which.min(e_nn_exo$abs$indices)\n",
    "i_p = which.max(e_nn_exo$abs$indices)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "options(repr.plot.width=9, repr.plot.height=4)\n",
    "par(mfrow=c(1,2))\n",
    "\n",
    "plotPredReal(data, p_nn_exo, i_np); title(paste(\"PredReal nn exo day\",i_np))\n",
    "plotPredReal(data, p_nn_exo, i_p); title(paste(\"PredReal nn exo day\",i_p))\n",
    "\n",
    "plotPredReal(data, p_nn_mix, i_np); title(paste(\"PredReal nn mix day\",i_np))\n",
    "plotPredReal(data, p_nn_mix, i_p); title(paste(\"PredReal nn mix day\",i_p))\n",
    "\n",
    "plotPredReal(data, p_az, i_np); title(paste(\"PredReal az day\",i_np))\n",
    "plotPredReal(data, p_az, i_p); title(paste(\"PredReal az day\",i_p))\n",
    "\n",
    "# Blue: predicted, black: measured"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "f_np_exo = computeFilaments(data, p_nn_exo, i_np, plot=TRUE); title(paste(\"Filaments nn exo day\",i_np))\n",
    "f_p_exo = computeFilaments(data, p_nn_exo, i_p, plot=TRUE); title(paste(\"Filaments nn exo day\",i_p))\n",
    "\n",
    "f_np_mix = computeFilaments(data, p_nn_mix, i_np, plot=TRUE); title(paste(\"Filaments nn mix day\",i_np))\n",
    "f_p_mix = computeFilaments(data, p_nn_mix, i_p, plot=TRUE); title(paste(\"Filaments nn mix day\",i_p))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotFilamentsBox(data, f_np_exo); title(paste(\"FilBox nn exo day\",i_np))\n",
    "plotFilamentsBox(data, f_p_exo); title(paste(\"FilBox nn exo day\",i_p))\n",
    "\n",
    "plotFilamentsBox(data, f_np_mix); title(paste(\"FilBox nn mix day\",i_np))\n",
    "plotFilamentsBox(data, f_p_mix); title(paste(\"FilBox nn mix day\",i_p))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotRelVar(data, f_np_exo); title(paste(\"StdDev nn exo day\",i_np))\n",
    "plotRelVar(data, f_p_exo); title(paste(\"StdDev nn exo day\",i_p))\n",
    "\n",
    "plotRelVar(data, f_np_mix); title(paste(\"StdDev nn mix day\",i_np))\n",
    "plotRelVar(data, f_p_mix); title(paste(\"StdDev nn mix day\",i_p))\n",
    "\n",
    "# Global variability in red; over the 60 neighbors (+ tomorrows) in black"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotSimils(p_nn_exo, i_np); title(paste(\"Weights nn exo day\",i_np))\n",
    "plotSimils(p_nn_exo, i_p); title(paste(\"Weights nn exo day\",i_p))\n",
    "\n",
    "plotSimils(p_nn_mix, i_np); title(paste(\"Weights nn mix day\",i_np))\n",
    "plotSimils(p_nn_mix, i_p); title(paste(\"Weights nn mix day\",i_p))\n",
    "\n",
    "# Less polluted on the left, more polluted on the right"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Windows selected in (0,10] / endo on the left, exo on the right\n",
    "p_nn_exo$getParams(i_np)$window\n",
    "p_nn_exo$getParams(i_p)$window\n",
    "\n",
    "p_nn_mix$getParams(i_np)$window\n",
    "p_nn_mix$getParams(i_p)$window"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "<h2 style=\"color:blue;font-size:2em\">Unpolluted week</h2>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "p_nn_exo = computeForecast(data, indices_np, \"Neighbors\", \"Neighbors\",\n",
    "\thorizon=3, simtype=\"exo\")\n",
    "p_nn_mix = computeForecast(data, indices_np, \"Neighbors\", \"Neighbors\",\n",
    "\thorizon=3, simtype=\"mix\")\n",
    "p_az = computeForecast(data, indices_np, \"Average\", \"Zero\",\n",
    "\thorizon=3)\n",
    "p_pz = computeForecast(data, indices_np, \"Persistence\", \"Zero\",\n",
    "\thorizon=3, same_day=FALSE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "e_nn_exo = computeError(data, p_nn_exo, 3)\n",
    "e_nn_mix = computeError(data, p_nn_mix, 3)\n",
    "e_az = computeError(data, p_az, 3)\n",
    "e_pz = computeError(data, p_pz, 3)\n",
    "options(repr.plot.width=9, repr.plot.height=7)\n",
    "plotError(list(e_nn_mix, e_pz, e_az, e_nn_exo), cols=c(1,2,colors()[258], 4))\n",
    "\n",
    "# Black: neighbors_mix, blue: neighbors_exo, green: average, red: persistence\n",
    "\n",
    "i_np = which.min(e_nn_exo$abs$indices)\n",
    "i_p = which.max(e_nn_exo$abs$indices)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "options(repr.plot.width=9, repr.plot.height=4)\n",
    "par(mfrow=c(1,2))\n",
    "\n",
    "plotPredReal(data, p_nn_exo, i_np); title(paste(\"PredReal nn exo day\",i_np))\n",
    "plotPredReal(data, p_nn_exo, i_p); title(paste(\"PredReal nn exo day\",i_p))\n",
    "\n",
    "plotPredReal(data, p_nn_mix, i_np); title(paste(\"PredReal nn mix day\",i_np))\n",
    "plotPredReal(data, p_nn_mix, i_p); title(paste(\"PredReal nn mix day\",i_p))\n",
    "\n",
    "plotPredReal(data, p_az, i_np); title(paste(\"PredReal az day\",i_np))\n",
    "plotPredReal(data, p_az, i_p); title(paste(\"PredReal az day\",i_p))\n",
    "\n",
    "# Blue: predicted, black: measured"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "f_np_exo = computeFilaments(data, p_nn_exo, i_np, plot=TRUE); title(paste(\"Filaments nn exo day\",i_np))\n",
    "f_p_exo = computeFilaments(data, p_nn_exo, i_p, plot=TRUE); title(paste(\"Filaments nn exo day\",i_p))\n",
    "\n",
    "f_np_mix = computeFilaments(data, p_nn_mix, i_np, plot=TRUE); title(paste(\"Filaments nn mix day\",i_np))\n",
    "f_p_mix = computeFilaments(data, p_nn_mix, i_p, plot=TRUE); title(paste(\"Filaments nn mix day\",i_p))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotFilamentsBox(data, f_np_exo); title(paste(\"FilBox nn exo day\",i_np))\n",
    "plotFilamentsBox(data, f_p_exo); title(paste(\"FilBox nn exo day\",i_p))\n",
    "\n",
    "plotFilamentsBox(data, f_np_mix); title(paste(\"FilBox nn mix day\",i_np))\n",
    "plotFilamentsBox(data, f_p_mix); title(paste(\"FilBox nn mix day\",i_p))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotRelVar(data, f_np_exo); title(paste(\"StdDev nn exo day\",i_np))\n",
    "plotRelVar(data, f_p_exo); title(paste(\"StdDev nn exo day\",i_p))\n",
    "\n",
    "plotRelVar(data, f_np_mix); title(paste(\"StdDev nn mix day\",i_np))\n",
    "plotRelVar(data, f_p_mix); title(paste(\"StdDev nn mix day\",i_p))\n",
    "\n",
    "# Global variability in red; over the 60 neighbors (+ tomorrows) in black"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotSimils(p_nn_exo, i_np); title(paste(\"Weights nn exo day\",i_np))\n",
    "plotSimils(p_nn_exo, i_p); title(paste(\"Weights nn exo day\",i_p))\n",
    "\n",
    "plotSimils(p_nn_mix, i_np); title(paste(\"Weights nn mix day\",i_np))\n",
    "plotSimils(p_nn_mix, i_p); title(paste(\"Weights nn mix day\",i_p))\n",
    "\n",
    "# Less polluted on the left, more polluted on the right"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Windows selected in (0,10] / endo on the left, exo on the right\n",
    "p_nn_exo$getParams(i_np)$window\n",
    "p_nn_exo$getParams(i_p)$window\n",
    "\n",
    "p_nn_mix$getParams(i_np)$window\n",
    "p_nn_mix$getParams(i_p)$window"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "<h2>Summary</h2>\n",
    "\n",
    "This is a difficult problem: we hardly do better than a naive average of the tomorrows of\n",
    "similar days in the past, which is not far from predicting a constant series equal to the\n",
    "last observed value (\"zero\" method). Persistence sometimes gives good results\n",
    "but is too unstable (it is sensitive to the <code>same_day</code> argument).\n",
    "\n",
    "How can the method be improved?"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R",
   "language": "R",
   "name": "ir"
  },
  "language_info": {
   "codemirror_mode": "r",
   "file_extension": ".r",
   "mimetype": "text/x-r-source",
   "name": "R",
   "pygments_lexer": "r",
   "version": "3.3.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}