Some work on IMD Pune's gridded data sets Author: Nikhil VJ, https://nikhilvj.co.in
Source URL: https://imdpune.gov.in/Clim_Pred_LRF_New/Grided_Data_Download.html | Alternate: https://imdpune.gov.in/lrfindex.php -> See under Gridded data in side menu.
Intentions of this project: To make this data more accessible for people, to show much simpler code to extract the data than what I've seen online. And to get my hands dirty on a large trove of Indian open data :)
Note: All code is in python Install imdlib package Assuming that you've downloaded 2010 data file, saved it as "2010.grd" in "rain" folder next to the program :
import imdlib
rain1 = imdlib.open_data('rain', 2010, 2010, 'yearwise').get_xarray().to_dataframe()
rain2 = rain1[rain1['rain'] > -100].reset_index()
rain2.to_csv('rain_2010.csv',index=False)
The functions .get_xarray()
, .to_dataframe()
and .reset_index()
do the job of converting the multi-dimensional dataset into a flat table that can be saved to CSV, excel, etc.
Tip: Do this in Jupyter Notebook, do just till .get_xarray() and then print the variable directly in a cell. It's beautiful.
More extended tutorial in the author's blog: https://saswatanandi.github.io/softwares/imdlib/
{"2020-01-01": {"rain": 0.0, "tmax": 17.580678939819336, "tmin": 3.262223720550537},
"2020-01-02": {"rain": 0.0, "tmax": 20.557445526123047, "tmin": 6.12726354598999},
"2020-01-03": {"rain": 0.0, "tmax": 21.97892951965332, "tmin": 7.391693115234375},
"2020-01-04": {"rain": 0.0, "tmax": 22.463241577148438, "tmin": 7.331312656402588},
"2020-01-05": {"rain": 0.0, "tmax": 22.185802459716797, "tmin": 7.799867630004883},
"2020-01-06": {"rain": 0.0, "tmax": 19.416574478149414, "tmin": 10.629579544067383},
"2020-01-07": {"rain": 0.0, "tmax": 18.216487884521484, "tmin": 10.342350959777832},
"2020-01-08": {"rain": 0.2756316661834717, "tmax": 17.91546058654785, "tmin": 9.598325729370117},
"2020-01-09": {"rain": 0.0, "tmax": 18.36847496032715, "tmin": 4.737661838531494},
"2020-01-10": {"rain": 0.0, "tmax": 19.597763061523438, "tmin": 3.922478437423706},
"2020-01-11": {"rain": 0.0, "tmax": 21.578903198242188, "tmin": 6.793002605438232},
"2020-01-12": {"rain": 0.0, "tmax": 23.62282943725586, "tmin": 8.204022407531738},
"2020-01-13": {"rain": 7.230362892150879, "tmax": 17.89722442626953, "tmin": 11.31865119934082},
"2020-01-14": {"rain": 0.5124186873435974, "tmax": 17.625137329101562, "tmin": 5.582608222961426},
"2020-01-15": {"rain": 0.0, "tmax": 17.577001571655273, "tmin": 4.793914794921875},
"2020-01-16": {"rain": 0.0, "tmax": 16.759170532226562, "tmin": 6.7936177253723145},
"2020-01-17": {"rain": 0.0, "tmax": 19.58401870727539, "tmin": 5.236929416656494},
"2020-01-18": {"rain": 0.0, "tmax": 19.54751205444336, "tmin": 5.679737567901611},
"2020-01-19": {"rain": 0.0, "tmax": 18.521821975708008, "tmin": 5.7684712409973145},
"2020-01-20": {"rain": 0.0, "tmax": 19.22909164428711, "tmin": 6.524430751800537},
"2020-01-21": {"rain": 0.0, "tmax": 21.767934799194336, "tmin": 8.751236915588379},
"2020-01-22": {"rain": 0.0, "tmax": 21.532318115234375, "tmin": 8.174297332763672},
"2020-01-23": {"rain": 0.0, "tmax": 21.776113510131836, "tmin": 7.345406532287598},
"2020-01-24": {"rain": 0.0, "tmax": 22.189123153686523, "tmin": 6.468899250030518},
"2020-01-25": {"rain": 0.0, "tmax": 24.130014419555664, "tmin": 6.97148323059082},
"2020-01-26": {"rain": 0.0, "tmax": 25.91004180908203, "tmin": 7.372500896453857},
"2020-01-27": {"rain": 0.0, "tmax": 23.614274978637695, "tmin": 11.573892593383789},
"2020-01-28": {"rain": 7.257974147796631, "tmax": 19.39422607421875, "tmin": 11.302903175354004},
"2020-01-29": {"rain": 0.0, "tmax": 21.57762336730957, "tmin": 6.99928092956543},
"2020-01-30": {"rain": 0.0, "tmax": 21.596620559692383, "tmin": 7.587302207946777},
"2020-01-31": {"rain": 0.0, "tmax": 21.153125762939453, "tmin": 6.719666004180908}}
df = pd.DataFrame(data).transpose().reset_index().rename(columns={'index':'date'})
Note: tmax and tmin were available at lower grid resolution than rainfall data, so in the DB table imd_data there will be locations that only have rainfall data.
imd_temp_data
and temp_grid
tables contain data and grid locations respectively of just the temperature records. They're much smaller in quantity than the rain records, so use these if you only want temperature data.
Check out the .ipynb Jupyter notebooks (python3 programs) here showing sample code to work with the data in Database once you have it ready.
For location 18.5,74.0 nr Pune, India, cumulative monthly rainfall from 1901 to 2021:
See 2022-07-14 rainfal viz 1.ipynb for the code that made this.