doctestprinter.print_pandas

doctestprinter.print_pandas(frame_or_series: Optional = None, formats: Optional[str] = None, max_line_count: Optional[int] = None, max_title_width: Optional[int] = None)

Prints pandas.DataFrame and pandas.Series objects with explicit formatting.

Parameters
  • frame_or_series – The DataFrame or Series to be printed.

  • formats (str; optional) – Concatenated format specifiers for the index column(s) and value column(s), with # separating the two groups. Default is {:>f}#{:>f}.

  • max_line_count (int; optional) – Maximum number of lines to print. If the output is longer, half of this count is used for the head and half for the tail. Default is 60.

  • max_title_width (int; optional) – Maximum printed width of the column title. Default is 16.
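
The head-and-tail behaviour described for max_line_count can be pictured with a minimal sketch. The helper name head_and_tail and the "..." separator are assumptions made for illustration only; they are not taken from doctestprinter and its actual truncation may differ.

```python
def head_and_tail(lines, max_line_count=60):
    """Hypothetical helper: keep the first and last max_line_count // 2 lines."""
    lines = list(lines)
    if len(lines) <= max_line_count:
        # Short enough, nothing to cut.
        return lines
    half = max_line_count // 2
    # The "..." separator is an assumption, not doctestprinter's marker.
    return lines[:half] + ["..."] + lines[len(lines) - half:]


rows = [str(number) for number in range(10)]
print(head_and_tail(rows, max_line_count=4))
```

With ten rows and a limit of four, two rows from each end are kept.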

Warning

This method does not reproduce the exact representation that pandas produces. This is intentional, as the function was written to tackle issues with using doctests in combination with pytest.

Notes

This function was written in response to doctests failing in pytest because the formatting behaviour differs between running the doctest outside and inside pytest.

Other reasons are the differing representation of NaN values, the differing representation of floats on different operating systems, and the lack of a way to define the string formatting for each column.

The NaN representation changes between nan and NaN across different Python versions. When the tests run on Linux and Windows on different machines, the float representation in the footer of a printed Series changes, leading to failed doctests. The first problem is solved by lowercasing NaN to nan, and the second by a changed Series representation.
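
The NaN normalisation mentioned above can be illustrated with a short sketch. The helper lowercase_nan is hypothetical and is not part of doctestprinter; it only shows the idea of replacing the token in already rendered text.

```python
import re


def lowercase_nan(text: str) -> str:
    """Hypothetical helper: replace standalone 'NaN' tokens with 'nan'."""
    # \b keeps words that merely contain 'NaN' untouched.
    return re.sub(r"\bNaN\b", "nan", text)


print(lowercase_nan("a    1.0\nb    NaN"))
```

Normalising the rendered text, rather than the data, leaves the values themselves unchanged.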

Examples

The default format specification is {:>g}. The intention of this function, however, is to define a specific format specification for each test, so that the doctest result is fixed for any Python version within a tox test.

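>>> import numpy as np
>>> from pandas import DataFrame, Index
>>> from doctestprinter import print_pandas
>>> single_index_test_frame = DataFrame(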
...     np.linspace(1/3, 1000/3, num=4).reshape(2, 2),
...     columns=["x", "y"],
...     index=Index([0.1, 0.211], name="t")
... )
>>> print_pandas(single_index_test_frame)
                x        y
t
0.100    0.333333  111.333
0.211  222.333333  333.333
>>> print_pandas(single_index_test_frame, "{:>.1f}#{:>4.1f}{:>e}")
         x             y
t
0.1    0.3  1.113333e+02
0.2  222.3  3.333333e+02
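
The format string in the call above places the index specifiers before the # and the value specifiers after it. A minimal sketch of how such a string could be taken apart follows; split_formats is a hypothetical helper inferred from the examples on this page, not doctestprinter's own parser.

```python
import re


def split_formats(formats: str):
    """Hypothetical helper: split 'index#values' into two specifier lists."""
    index_part, _, value_part = formats.partition("#")
    # Each specifier is one brace-delimited replacement field.
    field = re.compile(r"\{[^{}]*\}")
    return field.findall(index_part), field.findall(value_part)


index_specs, value_specs = split_formats("{:>.1f}#{:>4.1f}{:>e}")
print(index_specs)
print(value_specs)
# Each specifier works with str.format on a single value.
print(value_specs[0].format(222.333333))
```

Each extracted specifier is an ordinary Python format field, so it can be applied to one cell at a time.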

The main purpose of print_pandas() is to fix the format of each column within a DataFrame.

>>> from pandas import MultiIndex
>>> index_items = zip("aabb", np.linspace(0.0, 1/3, num=4))
>>> sample_index = MultiIndex.from_tuples(index_items, names=["x1", "x2"])
>>> sample_frame = DataFrame(
...     np.linspace(1/3, 1000/3, num=12).reshape(4, 3),
...     index=sample_index,
...     columns=["Alpha", "Beta", "G"]
... )
>>> print_pandas(sample_frame, "{:>4}{:>.2f}#{:.0f}{:.1e}{:.5g}")
            Alpha     Beta        G
x1    x2
   a  0.00      0  3.1e+01   60.879
      0.11     91  1.2e+02  151.697
   b  0.22    182  2.1e+02  242.515
      0.33    273  3.0e+02  333.333