Skip to content

Weight problems in cdutil.ANNUALCYCLE.climatology (daily data) #4

@chaosphere2112

Description

@chaosphere2112

I have found out that when using cdutil.ANNUALCYCLE.climatology on daily data, the months of different years seem to be equally weighted.

This results in (slight but visible) errors:

  • for February: eg for years, when averaging the February values for 1959-1960-1961, the weights used seem to be [1, 1, 1], when they should be the actual month length: [28, 29, 28]
  • for months that have missing days: e.g. if the first month of the data file starts on January 2nd, but all the other January month in the time series have 31 days, the weights used seem to be [1, 1, 1, ...] when they should be [30, 31, 31, ..., 31]

Test script: https://files.lsce.ipsl.fr/public.php?service=files&t=115de1c1cb57217b85776fd08fe21b23
Test data: https://files.lsce.ipsl.fr/public.php?service=files&t=c7258c48f12d3451bfa67e5585bbedef

Running the test script above will get the following output, and you can see that the difference between ANNUALCYCLEclimatology and computation by hand for all months, including unweighted February is ~1e6. But there is a bigger difference when I compute a correctly weighted February mean

python -i bug_ANNUALCYCLEclimatology.py
Variable shape = (3653, 9, 11)
Time axes go from 1959-1-1 12:0:0.0 to 1968-12-31 12:0:0.0
Latitude values = [-66.7283256  -67.84978441 -68.97123954 -70.09269035 -71.21413608
 -72.33557575 -73.45700815 -74.57843166 -75.69984422]
Longitude values = [ 120.375  121.5    122.625  123.75   124.875  126.     127.125  128.25
  129.375  130.5    131.625]
Fri Nov  6 17:35:08 2015  - Computing the climatology with cdutil.ANNUALCYCLE.climatology,
                please wait...
Fri Nov  6 17:35:26 2015  - Done!
time.clock diff = 17.36

Fri Nov  6 17:35:26 2015  - Computing the climatology by hand...
year, month, nb of selected days = 1959  2 28
year, month, nb of selected days = 1960  2 29
year, month, nb of selected days = 1961  2 28
year, month, nb of selected days = 1962  2 28
year, month, nb of selected days = 1963  2 28
year, month, nb of selected days = 1964  2 29
year, month, nb of selected days = 1965  2 28
year, month, nb of selected days = 1966  2 28
year, month, nb of selected days = 1967  2 28
year, month, nb of selected days = 1968  2 29
Difference range for month 1 = (-4.903731820604662e-06, 6.318861416332311e-06)
Difference range for February (unweighted) = (-5.2588326724389844e-06, 5.322724128120626e-06)
Difference range for February (WEIGHTED) = (-0.014045400572531008, -0.0006070483053850495)
Difference range for month 3 = (-7.192550192769431e-06, 6.8172331566529465e-06)
Difference range for month 4 = (-8.519490563685395e-06, 6.39597574547679e-06)
Difference range for month 5 = (-7.727838379878449e-06, 7.407895935784836e-06)
Difference range for month 6 = (-8.519490563685395e-06, 6.81559244952723e-06)
Difference range for month 7 = (-7.284841260002395e-06, 9.992045690410123e-06)
Difference range for month 8 = (-8.761498278886393e-06, 7.900114923131696e-06)
Difference range for month 9 = (-8.850097671597723e-06, 7.451375310552066e-06)
Difference range for month 10 = (-7.30945221505408e-06, 6.18965391652182e-06)
Difference range for month 11 = (-6.2370300426550784e-06, 6.243387851156967e-06)
Difference range for month 12 = (-6.288097772255696e-06, 6.534207223296562e-06)
Fri Nov  6 17:35:26 2015  - Finished computing the climatology by hand!
time.clock diff = 0.3
Acceleration = 57.8666666667

Migrated from: CDAT/cdat#1664

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions