# sell_prices.csv.zip
# Source data: https://www.kaggle.com/c/m5-forecasting-uncertainty/
import pandas as pd
from dtype_diet import report_on_dataframe

df = pd.read_csv('data/sell_prices.csv')

report_on_dataframe(df)

`report_on_dataframe` shows the possible dtype conversion for each column and the resulting memory improvement. Note that the library optimizes memory based on the current values of the data, so you should still be careful about overflow in later transformations.
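To make the overflow caveat concrete, here is a minimal sketch using plain pandas and NumPy (not dtype_diet itself). The values are made up for illustration: they fit in `int8` today, but a later transformation pushes them out of range and the result silently wraps around.

```python
import numpy as np
import pandas as pd

# Hypothetical values: both fit in int8 (-128..127) at the time of the check.
s = pd.Series([100, 120], dtype="int64")
small = s.astype("int8")

# A later transformation: the arithmetic stays in int8 and wraps around.
doubled_small = small + small
# The same transformation on the original int64 column is safe.
doubled_safe = s + s

print(doubled_small.tolist())  # wrapped values
print(doubled_safe.tolist())   # correct values

# One possible guard: check the expected range before downcasting.
limits = np.iinfo(np.int8)
fits_after_doubling = bool(doubled_safe.between(limits.min, limits.max).all())
print(fits_after_doubling)
```

If later steps can grow the values, it may be safer to keep a wider dtype than the smallest one that fits the current data.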

if __name__ == "__main__":
    print("Given a dataframe, check for lowest possible conversions:")

    nbr_rows = 100
    df = pd.DataFrame()
    df["a"] = [0] * nbr_rows
    df["b"] = [256] * nbr_rows
    df["c"] = [65_536] * nbr_rows
    df["d"] = [1_100.0] * nbr_rows
    df["e"] = [100_101.0] * nbr_rows
    df["str_a"] = ["hello"] * nbr_rows
    df["str_b"] = [str(n) for n in range(nbr_rows)]
    report_on_dataframe(df)

    print("convert_dtypes does a slightly different job:")
    print(df.convert_dtypes())

Given a dataframe, check for lowest possible conversions:
convert_dtypes does a slightly different job:
    a    b      c     d       e  str_a str_b
0   0  256  65536  1100  100101  hello     0
1   0  256  65536  1100  100101  hello     1
2   0  256  65536  1100  100101  hello     2
3   0  256  65536  1100  100101  hello     3
4   0  256  65536  1100  100101  hello     4
.. ..  ...    ...   ...     ...    ...   ...
95  0  256  65536  1100  100101  hello    95
96  0  256  65536  1100  100101  hello    96
97  0  256  65536  1100  100101  hello    97
98  0  256  65536  1100  100101  hello    98
99  0  256  65536  1100  100101  hello    99

[100 rows x 7 columns]
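As a rough illustration of how `convert_dtypes` differs from downcasting, the sketch below compares it with `pd.to_numeric(..., downcast="integer")` on the same integer columns. The downcast call is only a stand-in for "find a smaller dtype", an assumption made for this comparison, and not how dtype_diet is implemented.

```python
import pandas as pd

nbr_rows = 100
df = pd.DataFrame({
    "a": [0] * nbr_rows,
    "b": [256] * nbr_rows,
    "c": [65_536] * nbr_rows,
    "d": [1_100.0] * nbr_rows,
})

# convert_dtypes switches to pandas' nullable extension dtypes;
# it does not look for a smaller representation.
converted = df.convert_dtypes()
print(converted.dtypes)

# Downcasting picks the smallest signed integer dtype that holds the values.
downcast = df[["a", "b", "c"]].apply(pd.to_numeric, downcast="integer")
print(downcast.dtypes)
```

Here `convert_dtypes` keeps 64-bit storage (and turns the whole-number float column `d` into a nullable integer), while downcasting shrinks each integer column to the narrowest signed type that fits.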

Report for `sell_prices.csv`:

| Column | Current dtype | Proposed dtype | Current Memory (MB) | Proposed Memory (MB) | Ram Usage Improvement (MB) | Ram Usage Improvement (%) |
|---|---|---|---|---|---|---|
| store_id | object | category | 203763.920410 | 3340.907715 | 200423.012695 | 98.360403 |
| item_id | object | category | 233039.977539 | 6824.677734 | 226215.299805 | 97.071456 |
| wm_yr_wk | int64 | int16 | 26723.191406 | 6680.844727 | 20042.346680 | 74.999825 |
| sell_price | float64 | None | 26723.191406 | NaN | NaN | NaN |
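The proposals in a report like this can be applied by hand with `DataFrame.astype`. The sketch below assumes a small synthetic frame that reuses the `sell_prices.csv` column names; the row values are invented for illustration and the memory figures it prints will not match the report.

```python
import pandas as pd

# Synthetic stand-in for sell_prices.csv (values are made up).
df = pd.DataFrame({
    "store_id": ["CA_1", "CA_2"] * 500,
    "item_id": ["HOBBIES_1_001", "HOBBIES_1_002"] * 500,
    "wm_yr_wk": [11325, 11326] * 500,
    "sell_price": [9.58, 8.26] * 500,
})

before = df.memory_usage(deep=True).sum()

# Apply the proposed dtypes; sell_price has no proposal and is left alone.
slim = df.astype({
    "store_id": "category",
    "item_id": "category",
    "wm_yr_wk": "int16",
})

after = slim.memory_usage(deep=True).sum()
print(f"before: {before} bytes, after: {after} bytes")
```

Comparing `memory_usage(deep=True)` before and after is a simple way to confirm that a proposed conversion pays off on your own data.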

dtype_diet

`count_errors`

`map_dtypes_to_choices`

`get_smallest_valid_conversion`

`get_improvement`

`report_on_dataframe`

`optimize_dtypes`
