Simplify plots: version OK with R6 classes
[talweg.git] / reports / report_2017-03-01.7h_average.ipynb
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "library(talweg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "ts_data = read.csv(system.file(\"extdata\",\"pm10_mesures_H_loc.csv\",package=\"talweg\"))\n",
    "exo_data = read.csv(system.file(\"extdata\",\"meteo_extra_noNAs.csv\",package=\"talweg\"))\n",
    "data = getData(ts_data, exo_data, input_tz = \"Europe/Paris\", working_tz=\"Europe/Paris\", predict_at=7)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "I ran a few trials in different configurations for the \"Neighbors\" method (the only one we have discussed).<br>The best settings appear to be\n",
    "\n",
    " * simtype=\"mix\": both endogenous and exogenous similarities are used (window optimized by cross-validation)\n",
    " * same_season=FALSE: the cross-validation indices do not take seasons into account\n",
    " * mix_strategy=\"mult\": the weights are multiplied (rather than switching some of them off)\n",
    "\n",
    "(the default values).\n",
    "\n",
    "I systematically compared against two other approaches: persistence, and the average of the days following \"similar\" days over the whole past; in each case without predicting the jump (except for Neighbors, where the jump prediction is based on the computed weights).\n",
    "\n",
    "I then display the errors, a few predicted/measured curves, a few filaments, and the histograms of some weights. In the filament plots, the left half shows the days similar to the current day, while the right half shows their following days: these are the neighborhoods as used by the algorithm.\n",
    "\n",
    "<h2 style=\"color:blue;font-size:2em\">Pollution from heating</h2>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "indices_ch = seq(as.Date(\"2015-01-18\"),as.Date(\"2015-01-24\"),\"days\")\n",
    "p_ch_nn = computeForecast(data, indices_ch, \"Neighbors\", \"Neighbors\", simtype=\"mix\")\n",
    "p_ch_pz = computeForecast(data, indices_ch, \"Persistence\", \"Zero\", same_day=TRUE)\n",
    "p_ch_az = computeForecast(data, indices_ch, \"Average\", \"Zero\") #, memory=183)\n",
    "#p_ch_zz = computeForecast(data, indices_ch, \"Zero\", \"Zero\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "e_ch_nn = computeError(data, p_ch_nn)\n",
    "e_ch_pz = computeError(data, p_ch_pz)\n",
    "e_ch_az = computeError(data, p_ch_az)\n",
    "#e_ch_zz = computeError(data, p_ch_zz)\n",
    "options(repr.plot.width=9, repr.plot.height=7)\n",
    "plotError(list(e_ch_nn, e_ch_pz, e_ch_az), cols=c(1,2,colors()[258]))\n",
    "\n",
    "#Black: neighbors, red: persistence, green: average"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "options(repr.plot.width=9, repr.plot.height=4)\n",
    "plotPredReal(data, p_ch_nn, 3)\n",
    "plotPredReal(data, p_ch_nn, 4)\n",
    "\n",
    "#Blue: predicted, black: measured"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotPredReal(data, p_ch_az, 3)\n",
    "plotPredReal(data, p_ch_az, 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "f3_ch = computeFilaments(data, p_ch_nn$getIndexInData(3), plot=TRUE)\n",
    "f4_ch = computeFilaments(data, p_ch_nn$getIndexInData(4), plot=TRUE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotFilamentsBox(data, f3_ch)\n",
    "plotFilamentsBox(data, f4_ch)\n",
    "\n",
    "#Left: day 3 + next day (4); right: day 4 + next day (5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotRelVar(data, f3_ch)\n",
    "plotRelVar(data, f4_ch)\n",
    "\n",
    "#Global variability in red; over our 60 neighbors (+ next days) in black"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotSimils(p_ch_nn, 3)\n",
    "plotSimils(p_ch_nn, 4)\n",
    "\n",
    "#Unpolluted on the left, polluted on the right"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#Windows selected in (0,10] / endo on the left, exo on the right\n",
    "p_ch_nn$getParams(3)$window\n",
    "p_ch_nn$getParams(4)$window"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 style=\"color:blue;font-size:2em\">Pollution from fertilizer spreading</h2>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "indices_ep = seq(as.Date(\"2015-03-15\"),as.Date(\"2015-03-21\"),\"days\")\n",
    "p_ep_nn = computeForecast(data, indices_ep, \"Neighbors\", \"Neighbors\", simtype=\"mix\")\n",
    "p_ep_pz = computeForecast(data, indices_ep, \"Persistence\", \"Zero\", same_day=TRUE)\n",
    "p_ep_az = computeForecast(data, indices_ep, \"Average\", \"Zero\") #, memory=183)\n",
    "#p_ep_zz = computeForecast(data, indices_ep, \"Zero\", \"Zero\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "e_ep_nn = computeError(data, p_ep_nn)\n",
    "e_ep_pz = computeError(data, p_ep_pz)\n",
    "e_ep_az = computeError(data, p_ep_az)\n",
    "#e_ep_zz = computeError(data, p_ep_zz)\n",
    "options(repr.plot.width=9, repr.plot.height=7)\n",
    "plotError(list(e_ep_nn, e_ep_pz, e_ep_az), cols=c(1,2,colors()[258]))\n",
    "\n",
    "#Black: neighbors, red: persistence, green: average"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "options(repr.plot.width=9, repr.plot.height=4)\n",
    "plotPredReal(data, p_ep_nn, 4)\n",
    "plotPredReal(data, p_ep_nn, 6)\n",
    "\n",
    "#Blue: predicted, black: measured"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotPredReal(data, p_ep_az, 4)\n",
    "plotPredReal(data, p_ep_az, 6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "f4_ep = computeFilaments(data, p_ep_nn$getIndexInData(4), plot=TRUE)\n",
    "f6_ep = computeFilaments(data, p_ep_nn$getIndexInData(6), plot=TRUE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(2,2))\n",
    "plotFilamentsBox(data, f4_ep)\n",
    "plotFilamentsBox(data, f6_ep)\n",
    "\n",
    "#Left: day 4 + next day (5); right: day 6 + next day (7)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotRelativeVariability(data, f4_ep)\n",
    "plotRelativeVariability(data, f6_ep)\n",
    "\n",
    "#Global variability in red; over our 60 neighbors (+ next days) in black"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotSimils(p_ep_nn, 4)\n",
    "plotSimils(p_ep_nn, 6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#Windows selected in (0,10] / endo on the left, exo on the right\n",
    "p_ep_nn$getParams(4)$window\n",
    "p_ep_nn$getParams(6)$window"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 style=\"color:blue;font-size:2em\">Unpolluted week</h2>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "indices_np = seq(as.Date(\"2015-04-26\"),as.Date(\"2015-05-02\"),\"days\")\n",
    "p_np_nn = computeForecast(data, indices_np, \"Neighbors\", \"Neighbors\", simtype=\"mix\")\n",
    "p_np_pz = computeForecast(data, indices_np, \"Persistence\", \"Zero\", same_day=FALSE)\n",
    "p_np_az = computeForecast(data, indices_np, \"Average\", \"Zero\") #, memory=183)\n",
    "#p_np_zz = computeForecast(data, indices_np, \"Zero\", \"Zero\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "e_np_nn = computeError(data, p_np_nn)\n",
    "e_np_pz = computeError(data, p_np_pz)\n",
    "e_np_az = computeError(data, p_np_az)\n",
    "#e_np_zz = computeError(data, p_np_zz)\n",
    "options(repr.plot.width=9, repr.plot.height=7)\n",
    "plotError(list(e_np_nn, e_np_pz, e_np_az), cols=c(1,2,colors()[258]))\n",
    "\n",
    "#Black: neighbors, red: persistence, green: average"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "options(repr.plot.width=9, repr.plot.height=4)\n",
    "plotPredReal(data, p_np_nn, 5)\n",
    "plotPredReal(data, p_np_nn, 6)\n",
    "\n",
    "#Blue: predicted, black: measured"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotPredReal(data, p_np_az, 5)\n",
    "plotPredReal(data, p_np_az, 6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "f5_np = computeFilaments(data, p_np_nn$getIndexInData(5), plot=TRUE)\n",
    "f6_np = computeFilaments(data, p_np_nn$getIndexInData(6), plot=TRUE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(2,2))\n",
    "plotFilamentsBox(data, f5_np)\n",
    "plotFilamentsBox(data, f6_np)\n",
    "\n",
    "#Left: day 5 + next day (6); right: day 6 + next day (7)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotRelVar(data, f5_np)\n",
    "plotRelVar(data, f6_np)\n",
    "\n",
    "#Global variability in red; over our 60 neighbors (+ next days) in black"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "par(mfrow=c(1,2))\n",
    "plotSimils(p_np_nn, 5)\n",
    "plotSimils(p_np_nn, 6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#Windows selected in (0,10] / endo on the left, exo on the right\n",
    "p_np_nn$getParams(5)$window\n",
    "p_np_nn$getParams(6)$window"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "A hard problem: we do hardly better than a naive average of the days following similar days in the past, which is itself not far from predicting a constant series equal to the last observed value (the \"zero\" method). Persistence sometimes gives good results but is too unstable (it is sensitive to the <code>same_day</code> argument).\n",
    "\n",
    "How can the method be improved?"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R",
   "language": "R",
   "name": "ir"
  },
  "language_info": {
   "codemirror_mode": "r",
   "file_extension": ".r",
   "mimetype": "text/x-r-source",
   "name": "R",
   "pygments_lexer": "r",
   "version": "3.3.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
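
The notebook's Introduction mentions mixing endogenous and exogenous similarities with `mix_strategy="mult"` and comparing against an "average of next days" baseline. The idea can be illustrated outside the package with a minimal base-R sketch. This is not talweg's implementation: the helper names (`kernel_weights`, `mix_mult`, `neighbors_forecast`), the Gaussian kernel, and the toy data are all assumptions made for illustration.

```r
# Illustrative sketch only: hypothetical helpers, NOT part of talweg.

# Similarity of each past day (one per row of `past`) to `today`.
# Assumption: Gaussian kernel on the Euclidean distance, scaled by `window`.
kernel_weights <- function(past, today, window) {
  d2 <- rowSums(sweep(past, 2, today)^2)  # squared distance to each past day
  exp(-d2 / (2 * window^2))
}

# "mult" mixing: multiply endogenous and exogenous weights, then normalise.
mix_mult <- function(w_endo, w_exo) {
  w <- w_endo * w_exo
  w / sum(w)
}

# Forecast: weighted average of the days that followed each past day.
neighbors_forecast <- function(next_days, weights) {
  as.vector(crossprod(weights, next_days))
}

# Toy data (made up): 3 past days, 2 measurements per day, 1 exogenous value.
endo_past <- rbind(c(10, 12), c(30, 28), c(11, 13))
exo_past  <- rbind(c(1.0), c(1.1), c(5.0))
next_days <- rbind(c(14, 15), c(25, 24), c(40, 41))

w_endo <- kernel_weights(endo_past, today = c(10, 12), window = 2)
w_exo  <- kernel_weights(exo_past,  today = c(1.0),    window = 1)
w      <- mix_mult(w_endo, w_exo)

pred_neighbors <- neighbors_forecast(next_days, w)
# Equal weights recover the plain "average of next days" baseline.
pred_average   <- neighbors_forecast(next_days, rep(1 / 3, 3))
```

In this toy setup the third past day is close to the current day on the measured series but far away on the exogenous variable, so multiplying the two weights pushes its contribution close to zero and the forecast is dominated by the first day.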