Package functions

gfs_functions

This repository is part of the technology stack of gfs-zürich. It contains functions to create graphs and tables with Python.

Project objectives

  • streamline preprocessing of data files
  • streamline table creation for data files
  • streamline graphic creation for data files
  • create presentations

Technologies

  • Python
  • Jupyter Notebooks
  • SPSS Datafiles
  • Excel
  • JSON
  • API
  • Pandas

Requirements

  1. Python 3.10.6 installed on your local machine

Getting-Started

  1. Clone this repository (for help see this tutorial).
  2. Install or update your virtual environment by following the Virtual-Environment section below.

Virtual-Environment

Install Pipenv Package

In this project, we are using Pipenv for dependency management. Thus, you need the pipenv CLI tool installed on your computer:

  • open cmd or powershell
  • pip install pipenv

First install of Environment

As soon as this package is installed, you can install all required packages for this project by executing the following command inside the root project folder:

  • open cmd or powershell
  • cd /your/local/github/repofolder/
  • pipenv install
  • restart VSCode
  • choose the newly created gfs_functions virtual environment as the Python interpreter (for help see this tutorial)

Environment already installed (update dependencies)

If your environment already exists and you only want to update the dependencies, use these steps:

  • open cmd or powershell
  • cd /your/local/github/repofolder/
  • pipenv sync

Add new Dependency

If you need another dependency that is not yet defined in the Pipfile, install it with this command; it will also be added to the dependency list.

  • open cmd or powershell
  • cd /your/local/github/repofolder/
  • pipenv install <package>

File & Folder Structure

+---.vscode            # VS-Code settings
+---data               # Data Folder with example files
+---docs               # Automatically generated documentation files
+---functions          # Main function folder
|   +---graphy         # Functions for graph creation
|   +---matrixmixer    # Functions for table creation
|   +---metatables     # Functions for SPSS file data preprocessing
|   \---tools          # Functions for general tasks
+---gfs_projects       # Repository with project files
+---output             # Folder for outputs
+---resources          # Folder with resources for maps
+---templates          # Folder with all template files
|   +---fonts          # .ttf files with project fonts
|   +---logo           # gfs logo files
|   +---mapping        # JSON files with standard variable mappings
|   +---powerpoint     # Templates of Powerpoint presentations
|   +---shapefiles     # Shapefiles for maps
|   \---translations   # JSON files used for language translations
+---test               # test functions
+---uml                # UML diagrams for documentation

Add documentation

The documentation will be added to the gfs-zurich.github.io repository. To get the documentation into the repository, clone it into the ./docs folder as functions. Then run the following command in the root folder of this repository to create the updated documentation:

pdoc --html --output-dir docs functions --config show_source_code=False -f

or use (be sure to install draw.io.exe locally)

pipenv run .\documentation.bat

UML

Workflow for Metatable / data preprocessing

Removing Jupyter notebook output when committing to a Git repository

You can remove Jupyter notebook output automatically by using a Git hook. The following steps set up a pre-commit hook that removes the output from the Jupyter notebook cells before the code is committed:

  • Open your terminal and navigate to the Git repository where your Jupyter notebook is located.
  • Create a new file called pre-commit in the .git/hooks directory of the repository by running the following command:

type nul > .git/hooks/pre-commit

  • Open the pre-commit file in a text editor and add the following code:

#!/bin/bash

# Find staged Jupyter notebook files that were added, copied, or modified
changed_notebooks=$(git diff --cached --name-only --diff-filter=ACM | grep '\.ipynb$')

# If no notebooks have been changed, exit with a success status
if [ -z "$changed_notebooks" ]; then
  exit 0
fi

# Remove output from each changed notebook and add the change to the commit
# (reading line by line keeps file names that contain spaces intact)
while IFS= read -r notebook; do
  jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace "$notebook"
  git add "$notebook"
done <<< "$changed_notebooks"

# Exit with a success status
exit 0


  • Save and close the file.

Now, every time you run git commit, the pre-commit script automatically removes the output from the staged Jupyter notebooks before the commit is created, without any further interaction. Because the script only processes notebooks that are part of the commit, the output of all other notebooks in the repository stays untouched. On Linux or macOS, the hook file must also be executable (chmod +x .git/hooks/pre-commit).
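
To make concrete what the hook removes, here is a minimal Python sketch that clears the outputs of a notebook held in memory, using only the standard library. It illustrates the effect and is not what nbconvert runs internally. The function name strip_outputs is hypothetical, and the sketch assumes the nbformat 4 layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count").

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with all code-cell outputs removed.

    Assumes the nbformat 4 layout: a top-level "cells" list whose code
    cells carry "outputs" and "execution_count" fields.
    """
    cleaned = json.loads(json.dumps(notebook))  # deep copy of plain JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny in-memory notebook standing in for a real .ipynb file
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print('hi')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
        },
    ],
}

cleaned = strip_outputs(example)
print(cleaned["cells"][1]["outputs"])  # []
```

Markdown cells and the cell sources are left as they are; only the stored results of code cells are dropped, which is what keeps the committed diff small.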

Sub-modules

functions.aioli

functions.graphy

gfs-graphy …

functions.matrixmixer

gfs-matrixmixer …

functions.metatables

gfs-metatables …

functions.tools

gfs-tools These are tools for various tasks inside the technology stack of gfs-zürich.

Functions

def add_combined_graphs_slide(info: dict, figures: list, title_text: str, subtitle_text: str, tag_text: str, group_images: bool = True, share_y_labels: bool = True) ‑> None

Combines a list of plotly figures onto a grid and saves it as a PowerPoint slide.

Args

info : dict
The presentation object
figures : list
List of plotly figures. Make sure each figure is a separate instance; if in doubt, make a deepcopy.
title_text : str
The title text of the slide
subtitle_text : str
The subtitle text of the slide
tag_text : str
The tag text of the slide
group_images : bool, optional
determines whether images are grouped in the slide. Defaults to True.
share_y_labels : bool, optional
determines whether images share the y axis labels row-wise. Defaults to True.

Raises

ValueError
if an invalid number of figures is entered, only 2 to 9 figures are valid
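
The advice about separate instances and the 2-to-9 limit can be sketched as follows. This is an illustration only: prepare_figures is a hypothetical helper and not part of gfs_functions, and plain dictionaries stand in for plotly figures so that the example runs without plotly.

```python
import copy


def prepare_figures(figures: list) -> list:
    """Validate the figure count and return independent copies.

    Hypothetical helper: it mirrors the documented rule that only
    2 to 9 figures are valid.
    """
    if not 2 <= len(figures) <= 9:
        raise ValueError(f"expected 2 to 9 figures, got {len(figures)}")
    return [copy.deepcopy(fig) for fig in figures]


# A plain dict stands in for a plotly figure here
base = {"layout": {"title": "AW2"}}
shared = [base, base]  # the same object twice
independent = prepare_figures(shared)

# Changing one copy affects neither the other copy nor the original
independent[0]["layout"]["title"] = "changed"
print(independent[1]["layout"]["title"])  # AW2
print(base["layout"]["title"])            # AW2
```

Without the copies, a change made to one grid cell would show up in every cell that reuses the same figure object.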
def add_table_slide(info: dict, title_text: str, table: pandas.core.frame.DataFrame, table_width: float = 22.5, table_height: float = 5.0, table_top_distance: float = 4.5, table_left_distance: float = 1.5, column_titles: Optional[None] = None, column_alignments: Optional[None] = None, column_widths: Optional[None] = None, bold_cells: Optional[None] = None, font_size: int = 14, row_heights: Optional[None] = None, cells_with_custom_color: Optional[None] = None, show_table_background_color: bool = True, title_font_size: int = 20, table_font_name: str = 'Leelawadee UI', cells_with_hyperlinks: Optional[None] = None) ‑> None

Adds a table slide to the presentation

Args

info : dict
The presentation object
title_text : str
The title of the slide
table : pd.DataFrame
The table that should be added to the slide
table_width : float, optional
The width of the table. Defaults to 22.5.
table_height : float, optional
The height of the table. Defaults to 5.0.
column_alignments : Union[None, list[str]], optional
The alignment of the columns. Defaults to None. If None, all columns will be aligned left. If a list is passed, it should contain the alignment for each column in the table. The alignment can be 1 (left), 2 (center), or 3 (right).
column_titles : Union[None, list[str]], optional
The titles of the columns. Defaults to None. If None, the table will not have column titles. If a list is passed, it should contain the title for each column in the table.
column_widths : Union[None, list[int]], optional
The width of the columns. Defaults to None. If None, all columns will have the same width. If a list is passed, it should contain the width for each column in the table. The values should be integers and only the relative width is important.
bold_cells : Union[None, list[tuple[int, int]]], optional
The cells that should be bold. Defaults to None. If None, no cells will be bold. If a list is passed, it should contain the row and column index of the cells that should be bold.
font_size : int, optional
The font size of the text in the table. Defaults to 14.
row_heights : Union[None, list[float]], optional
The height of the rows. Defaults to None. If None, all rows will have the automatic height. If a list is passed, it should contain the height for each row in the table.
cells_with_custom_color : Union[None, dict], optional
The cells that should have a custom color. Defaults to None. If None, no cells will have a custom color. If a dictionary is passed, it should contain the row and column index of the cells as keys and the color as values. The color should be a string with the hex code of the color.
show_table_background_color : bool, optional
Determines if the table background color should be shown. Defaults to True.
title_font_size : int, optional
The font size of the title. Defaults to 20.
table_font_name : str, optional
The name of the font used in the whole table.
cells_with_hyperlinks : Union[None, dict], optional
The cells that should have a hyperlink. Defaults to None. If None, no cells will have a hyperlink. If a dictionary is passed, it should contain the row and column index of the cells as keys and the hyperlink as values. The hyperlink should be a string with the address of the hyperlink.
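
As a rough illustration of how these arguments might be shaped, the sketch below builds example values and shows what "only the relative width is important" means in practice. The helper relative_to_absolute is hypothetical, and the (row, column) tuples used for bold_cells and as dictionary keys are an assumption based on the descriptions above, not a confirmed format.

```python
def relative_to_absolute(column_widths: list[int], table_width: float) -> list[float]:
    """Scale relative column widths so that they add up to table_width."""
    total = sum(column_widths)
    return [table_width * width / total for width in column_widths]


# Argument values shaped as the descriptions suggest (illustrative only)
table_kwargs = {
    "column_titles": ["Question", "Yes", "No"],
    "column_alignments": [1, 2, 2],  # 1 = left, 2 = center, 3 = right
    "column_widths": [3, 1, 1],      # only the ratio matters
    "bold_cells": [(0, 0)],          # assumed (row, column) order
    "cells_with_custom_color": {(1, 1): "#131366"},  # assumed tuple keys
}

# With the default table_width of 22.5, the ratio 3:1:1 becomes:
print(relative_to_absolute(table_kwargs["column_widths"], 22.5))
# [13.5, 4.5, 4.5]
```

Passing [6, 2, 2] instead of [3, 1, 1] would give the same result, because only the proportions are used.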
def add_text_slide(info: dict, text: Union[list[dict], list[str]], my_title: str = 'gfs-zürich, Markt- & Sozialforschung', vertical_alignment: str = 'top') ‑> None

Adds a text slide to the presentation

Args

info : dict
The presentation object
text : Union[list[dict], list[str]]
The text that should be added to the slide. If a list of dictionaries is passed, each dictionary should contain a "text" key with the text that should be added to the slide. Additionally, the dictionary can contain a "size" key with the font size of the text, a "bold" key with a boolean value that determines if the text should be bold, a "hyperlink" key with the link target as its value, an "alignment" key that changes the text alignment (left, center, right) and a "color" key with the color of the text. If a list of strings is passed, each string will be added to the slide as a paragraph.
my_title : str, optional
The title of the slide. Defaults to 'gfs-zürich, Markt- & Sozialforschung'.
vertical_alignment : str, optional
The vertical alignment of the text. Defaults to 'top'.
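
The two accepted shapes of the text argument can be illustrated with a small sketch. The helper normalize_text is hypothetical and only shows how plain strings relate to the dictionary form; the hex color string and the example URL are assumptions, not documented values.

```python
from typing import Union


def normalize_text(text: Union[list[dict], list[str]]) -> list[dict]:
    """Turn plain strings into {"text": ...} dicts and pass dicts through.

    Hypothetical helper illustrating the two accepted input shapes.
    """
    paragraphs = []
    for item in text:
        if isinstance(item, str):
            paragraphs.append({"text": item})
        elif "text" in item:
            paragraphs.append(dict(item))
        else:
            raise ValueError('each dictionary needs a "text" key')
    return paragraphs


# Shape 1: a list of strings, one paragraph each
plain_text = ["First paragraph", "Second paragraph"]

# Shape 2: a list of dictionaries with optional styling keys
styled_text = [
    {"text": "Key findings", "size": 24, "bold": True, "alignment": "center"},
    {"text": "Full report", "hyperlink": "https://example.com", "color": "#131366"},
]

print(normalize_text(plain_text))
# [{'text': 'First paragraph'}, {'text': 'Second paragraph'}]
```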
def create_age_break(metatable: MetaTable2, original_variable: str = 'S12_1', new_break_name: str = 'age_break', dont_know_code: bool = False) ‑> None

Creates a standard age break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_apartment_search_break(metatable: MetaTable2, original_variable: str = 'WOHNUNGSSUCHE', new_break_name: str = 'apartment_search_break', dont_know_code: bool = False) ‑> None

Creates a standard apartment_search break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_apartment_search_simple_break(metatable: MetaTable2, original_variable: str = 'WOHNUNGSSUCHE', new_break_name: str = 'apartment_search_break', dont_know_code: bool = False) ‑> None

Creates a standard apartment_search break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_barchart(info: dict, variables: list, breaks: Optional[list] = None, legend_break: Optional[list] = None, barchart_mean: bool = False, break_labels_rename: Optional[dict[str, str]] = None, left_labels_rename: Optional[dict[str, str]] = None, legend_labels_rename: Optional[dict[str, str]] = None, left_labels_wrap: int = 50, left_labels_width: int = 999, show_mean: bool = False, title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', legend_title_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, legend_position: str = 'right', return_data: bool = False, color_theme: str = 'auto', color_type: str = 'groups', color_direction: str = 'forward', color_custom: Optional[None] = None, value_labels_show_min: float = 0, value_labels_add_after: Optional[list[list[str]]] = None, value_labels_transform: function = <function <lambda>>, weight: str = 'auto', order_by: Optional[None] = None, title_remove_before: str = '', title_remove_after: str = '', title_add_end: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', left_labels_remove_before: str = '', left_labels_remove_after: str = '', bar_gap: float = 999, bar_width: float = 999, bar_min_size: float = 0, break_distance: float = 999, label_size: float = 16, value_labels_size: float = 999, legend_labels_size: float = 14, show_legend: bool = True, show_total_legend: bool = True, show_total: bool = True, show_count: bool = False, show_index: bool = False, show_count_legend: bool = False, show_index_legend: bool = False, show_all_bars: bool = False, show_legend_title: Optional[bool] = None, select_variable_levels: Optional[None] = None, select_variables: Optional[None] = None, select_min_count: int = 0, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: Optional[None] = None, save_figure: bool = True, return_figure: bool = False, df: Union[pandas.core.frame.DataFrame, str] = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto') ‑> Optional[tuple]

Creates a barchart

Args

info : dict
A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list
A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
breaks : list, optional
A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to None.
legend_break : list, optional
A list of breaks. This is used to create the legend break if either "mean" == True or "select_variable_levels" has exactly one item ~ ['alter_break'] / [['alter_break']]. Defaults to None.
barchart_mean : bool, optional
Toggles barchart type mean, where mean value of variable is shown. Defaults to False.
break_labels_rename : dict[str, str], optional
Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_rename : dict[str, str], optional
Renames the labels of the different break levels. ~ {'old_name': 'new_name'}. Defaults to None.
legend_labels_rename : dict[str, str], optional
Renames the labels of the different legend levels. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_wrap : int, optional
Wraps the variable labels. Defaults to 50.
left_labels_width : int, optional
Changes the space allocated to the y-label. Defaults to 999 (which means automatic).
show_mean : bool, optional
Uses the mean of the variable as x-value and changes the scale type from percentage to mean. ~ True. Defaults to False.
title_custom_text : str, optional
Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional
Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional
Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
legend_title_custom_text : str, optional
Creates a custom legend title, if title is desired. Defaults to 'auto'.
title_position : int, optional
Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional
Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional
Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
legend_position : str, optional
Sets the position of the legend. ~ 'bottom'. Defaults to 'right'.
return_data : bool, optional
Returns the data frame, that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional
Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_type : str, optional
Sets color value type. ~ 'diverging', 'single_hue'. Defaults to 'groups'.
color_direction : str, optional
Sets direction of color pattern. ~ 'backward'. Defaults to 'forward'.
color_custom : list[str], optional
Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
value_labels_show_min : float, optional
Sets threshold showing labels by %. Use 0 to show all labels and 100 to show none. Use numbers between for a mix. Defaults to 0.
value_labels_add_after : list[list[str]], optional
Adds text after the value labels ~ [['text', 'text'], ['text', 'text']]. Defaults to None.
weight : str, optional
Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'auto'.
order_by : dict, optional
A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to None.
title_remove_before : str, optional
Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional
Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
title_add_end : str, optional
Appends the given string to the label text. Defaults to ''.
tag_add_before : str, optional
Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional
Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional
Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional
Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
left_labels_remove_before : str, optional
Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
left_labels_remove_after : str, optional
Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
bar_gap : float, optional
Sets the gap between the bars within a group. Defaults to 999 (automatic, i.e. 0.15).
bar_width : float, optional
Sets the width of the bars. Defaults to 999 (automatic, i.e. 0.7).
bar_min_size : float, optional
Sets the minimum size of the shown bars. Only affects the plot if show_all_bars is set to False. Defaults to 0.
break_distance : float, optional
Sets the distance between the breaks. Defaults to 999 (automatic, i.e. 0.5).
label_size : float, optional
Sets the size of labels. Defaults to 16.
value_labels_size : float, optional
Sets the size of value labels. Defaults to 999.
value_labels_transform : LambdaType, optional
Changes the size of the value_labels values with a transformation. lambda n: -n + 2 changes n to the negative value of n and adds 2. Defaults to lambda n: n.
legend_labels_size : float, optional
Sets the size of legend. Defaults to 14.
show_legend : bool, optional
Turns legend on and off. ~ False. Defaults to True.
show_total_legend : bool, optional
Turns Total in legend on and off. ~ False. Defaults to True.
show_total : bool, optional
Turns total bar on and off. ~ False. Defaults to True.
show_count : bool, optional
Turns counts in variable labels on and off. ~ True. Defaults to False.
show_index : bool, optional
Turns index in variable labels on and off. ~ True. Defaults to False.
show_count_legend : bool, optional
Turns counts in legend labels on and off. ~ True. Defaults to False.
show_index_legend : bool, optional
Turns index in legend labels on and off. ~ True. Defaults to False.
show_all_bars : bool, optional
Turns bars with zero observations on and off. ~ True. Defaults to False.
show_legend_title : bool, optional
Toggles legend title. ~ True / False. Defaults to None, which means automatic.
select_variable_levels : list[int], optional
A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to None.
select_min_count : int, optional
Selects all bars with higher count than given. ~ 10 means that only bars with more than 10 observations will be selected. Defaults to 0.
special_variables : list[int], optional
A list of variable levels that will be treated differently. They will appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional
Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional
Turns figure export on and off. Defaults to True.
return_figure : bool, optional
Turns figure return on and off. Defaults to False.
df : pd.DataFrame, optional
The current data file ~ usually df. Defaults to 'auto'.
meta : pyreadstat._readstat_parser.metadata_container, optional
The current meta data file ~ usually meta. Defaults to 'auto'.

Returns

Optional[tuple]
returns df and meta objects if return_data is True
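
Two of the behaviours described above, label trimming and the exclusion of special codes from the mean, can be sketched in plain Python. Both helpers are hypothetical and only restate the parameter descriptions; in particular, using the first occurrence of a marker and leaving the label unchanged when the marker is missing are assumptions.

```python
from typing import Optional

SPECIAL_CODES = [96, 97, 98, 99999996, 99999997, 99999998]


def trim_label(label: str, remove_before: str = "", remove_after: str = "") -> str:
    """Cut a label the way title_remove_before / title_remove_after are described.

    Assumption: the first occurrence of each marker is used, and a marker
    that does not occur leaves the label unchanged.
    """
    if remove_before and remove_before in label:
        label = label.split(remove_before, 1)[1]
    if remove_after and remove_after in label:
        label = label.split(remove_after, 1)[0]
    return label


def mean_without_special(values: list[int], special: list[int] = SPECIAL_CODES) -> Optional[float]:
    """Mean of the values that are not special codes; None if none remain."""
    regular = [value for value in values if value not in special]
    return sum(regular) / len(regular) if regular else None


label = "AW2? Wie zufrieden sind Sie. Bitte antworten Sie ehrlich"
print(trim_label(label, remove_before="? ", remove_after=". Bitte"))
# Wie zufrieden sind Sie

print(mean_without_special([1, 2, 3, 98]))  # 2.0
```

The code 98 in the last call is dropped before averaging, which matches the statement that special variable levels have no effect on the mean calculation.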
def create_canton_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a canton break based on a JSON mapping file.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
original_variable_type : str
Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_cantonal_banc_break(metatable: MetaTable2, original_variable: str = 'KANTONALBANK', new_break_name: str = 'kantonalbank_break', dont_know_code: bool = False) ‑> None

Creates a standard kantonalbank break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_count_employee_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'count_employee_break', dont_know_code: bool = False) ‑> None

Creates a standard count employee break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
def create_custom_break(metatable: MetaTable2, original_variable: str, new_break_name: str, recode_values: tuple[dict[int, range], bool], value_labels: dict[int, str], single_label: str, measure: str = 'nominal') ‑> None

Creates a custom break in the specified Metatable.

Args

metatable : MetaTable2
The Metatable that needs editing.
original_variable : str
Name of the original variable.
new_break_name : str
Name of the newly created break.
recode_values : Tuple[Dict[int, range], bool]
Tuple with recode values and whether to keep untouched codes, e.g., ({1: range(1, 6), 2: range(6, 11)}, True).
value_labels : Dict[int, str]
Value labels dictionary, e.g., {1: '1-5', 2: '6-10'}.
single_label : str
Single label of the break.
measure : str
Break measure, nominal, scale, etc. Default is 'nominal'.
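
The recode_values and value_labels arguments can be illustrated with a small stand-alone sketch. The function recode is hypothetical and works on a plain list rather than a Metatable; returning None for uncovered codes when the flag is False is an assumption about what "keep untouched codes" implies.

```python
from typing import Optional


def recode(values: list[int], recode_values: tuple[dict[int, range], bool]) -> list[Optional[int]]:
    """Map each value to the new code whose range contains it.

    The boolean in recode_values decides whether codes outside all
    ranges are kept as they are (True) or replaced by None (False).
    """
    mapping, keep_untouched = recode_values
    recoded = []
    for value in values:
        for new_code, old_codes in mapping.items():
            if value in old_codes:
                recoded.append(new_code)
                break
        else:
            # The value is not covered by any range
            recoded.append(value if keep_untouched else None)
    return recoded


recode_values = ({1: range(1, 6), 2: range(6, 11)}, True)
value_labels = {1: "1-5", 2: "6-10"}

codes = recode([3, 7, 10, 98], recode_values)
print(codes)                                     # [1, 2, 2, 98]
print([value_labels.get(c, c) for c in codes])   # ['1-5', '6-10', '6-10', 98]
```

Note that range(1, 6) covers the values 1 to 5, so the upper bound of each range is one higher than the last value in the label.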
def create_education_break(metatable: MetaTable2, original_variable: str = 'S15', new_break_name: str = 'education_break', dont_know_code: bool = False) ‑> None

Creates a standard education break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_employment_break(metatable: MetaTable2, original_variable: str = 'S13', new_break_name: str = 'employment_break', dont_know_code: bool = False) ‑> None

Creates a standard employment break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_founding_year_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'founding_year_break', dont_know_code: bool = False) ‑> None

Creates a standard founding year break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_gdenr_from_plz(metatable: MetaTable2, original_variable: str, new_variable: str, dont_know_code: bool = False) ‑> None

Creates Gemeindenummer from PLZ based on a JSON mapping file.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_variable : str
Name of the newly created variable
dont_know_code : bool
Whether the "don't know" code should be recoded
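
A lookup based on a JSON mapping can be sketched as follows. The flat {"PLZ": Gemeindenummer} layout, the two sample entries, the function name gdenr_from_plz and the fallback code are all assumptions made for illustration; the real mapping files in templates/mapping may be structured differently.

```python
import json

# Illustrative mapping in an assumed {"PLZ": Gemeindenummer} layout
mapping_json = '{"8001": 261, "3011": 351}'
plz_to_gdenr = json.loads(mapping_json)

DONT_KNOW = 99999998  # assumed placeholder code for values without a mapping


def gdenr_from_plz(plz: int) -> int:
    """Look up the Gemeindenummer for a PLZ, falling back to DONT_KNOW."""
    # JSON object keys are always strings, so the PLZ is converted first
    return plz_to_gdenr.get(str(plz), DONT_KNOW)


print([gdenr_from_plz(plz) for plz in (8001, 3011, 1234)])
# [261, 351, 99999998]
```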
def create_gemeindegrösse_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a gemeindegrösse break based on a JSON mapping file.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
original_variable_type : str
Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_gender_break(metatable: MetaTable2, original_variable: str = 'S11', new_break_name: str = 'gender_break', use_non_binary: bool = True, dont_know_code: bool = False) ‑> None

Creates a standard gender break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
use_non_binary : bool
Include "divers" gender if True, only use binary gender otherwise
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_grossregion_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a grossregion break based on a JSON mapping file.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
original_variable_type : str
Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_income_break(metatable: MetaTable2, original_variable: str = 'S14', new_break_name: str = 'income_break', dont_know_code: bool = False) ‑> None

Creates a standard income break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_industry_sector_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'industry_sector_break', dont_know_code: bool = False, values: dict = {}, value_labels: dict = {}) ‑> None

Creates a standard industry sector (Branche, NOGA Code) break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
values : dict
dictionary with industry sector values (NOGA)
value_labels : dict
dictionary with industry sector value labels (NOGA)
def create_lead_time_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'lead_time_break', year: int = 2025, dont_know_code: bool = False) ‑> None

Creates a standard lead time break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_linechart(info: dict, variables: list, breaks: Optional[list] = None, legend_break: Optional[list] = None, linechart_mean: bool = False, break_labels_rename: Optional[dict[str, str]] = None, left_labels_rename: Optional[dict[str, str]] = None, legend_labels_rename: Optional[dict[str, str]] = None, left_labels_wrap: int = 50, left_labels_width: int = 999, show_mean: bool = False, title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', legend_title_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, legend_position: str = 'right', return_data: bool = False, color_theme: str = 'auto', color_type: str = 'groups', color_direction: str = 'forward', color_custom: Optional[None] = None, value_labels_show_min: float = 0, value_labels_add_after: Optional[list[list[str]]] = None, weight: str = 'auto', order_by: Optional[None] = None, title_remove_before: str = '', title_remove_after: str = '', title_add_end: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', left_labels_remove_before: str = '', left_labels_remove_after: str = '', line_width: int = 999, line_marker_size: int = 999, break_distance: float = 999, label_size: float = 16, value_labels_size: float = 999, legend_labels_size: float = 14, show_legend: bool = True, show_total_legend: bool = True, show_total: bool = True, show_count: bool = False, show_index: bool = False, show_count_legend: bool = False, show_index_legend: bool = False, show_legend_title: Optional[bool] = None, select_variable_levels: Optional[None] = None, select_min_count: int = 0, break_x_axis: bool = False, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: Optional[None] = None, save_figure: bool = True, return_figure: bool = False, df: Union[pandas.core.frame.DataFrame, str] = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto') ‑> Optional[tuple]

Creates a linechart

Args

info : dict
A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list
A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
breaks : list, optional
A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to None.
legend_break : list, optional
A list of breaks. This is used to create the legend break if either "mean" == True or "select_variable_levels" has exactly one item ~ ['alter_break'] / [['alter_break']]. Defaults to None.
linechart_mean : bool, optional
Toggles linechart type mean, where mean value of variable is shown. Defaults to False.
break_labels_rename : dict[str, str], optional
Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_rename : dict[str, str], optional
Renames the labels of the different break levels. ~ {'old_name': 'new_name'}. Defaults to None.
legend_labels_rename : dict[str, str], optional
Renames the labels of the different legend levels. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_wrap : int, optional
Wraps the variable labels. Defaults to 50.
left_labels_width : int, optional
Changes the space allocated to the y-label. Defaults to 999 (which means automatic).
show_mean : bool, optional
Uses the mean of the variable as x-value and changes the scale type from percentage to mean. ~ True. Defaults to False.
title_custom_text : str, optional
Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional
Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional
Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
legend_title_custom_text : str, optional
Creates a custom legend title, if title is desired. Defaults to 'auto'.
title_position : int, optional
Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional
Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional
Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
legend_position : str, optional
Sets the position of the legend. ~ 'bottom'. Defaults to 'right'.
return_data : bool, optional
Returns the data frame, that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional
Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_type : str, optional
Sets color value type. ~ 'diverging', 'single_hue'. Defaults to 'groups'.
color_direction : str, optional
Sets direction of color pattern. ~ 'backward'. Defaults to 'forward'.
color_custom : list[str], optional
Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
value_labels_show_min : float, optional
Sets threshold showing labels by %. Use 0 to show all labels and 100 to show none. Use numbers between for a mix. Defaults to 0.
value_labels_add_after : list[list[str]], optional
Adds text after the value labels ~ [['text', 'text'], ['text', 'text']]. Defaults to None.
weight : str, optional
Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'auto'.
order_by : dict, optional
A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to None.
title_remove_before : str, optional
Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional
Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
title_add_end : str, optional
Appends the given string to the label text. Defaults to ''.
tag_add_before : str, optional
Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional
Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional
Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional
Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
left_labels_remove_before : str, optional
Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
left_labels_remove_after : str, optional
Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
bar_gap : int, optional
Sets the gap between the bars within a group. Defaults to 0.15.
bar_width : float, optional
Sets the width of the bars. Defaults to 0.7.
bar_min_size : float, optional
Sets the minimum size of the show bars. Only affects plot if show_all_bars is set to "False". Defaults to 0.
break_distance : float, optional
Sets the distance between the breaks. Defaults to 0.5
label_size : float, optional
Sets the size of labels. Defaults to 16.
value_labels_size : float, optional
Sets the size of value labels. Defaults to 999.
legend_labels_size : float, optional
Sets the size of legend. Defaults to 14.
show_legend : bool, optional
Turns legend on and off. ~ False. Defaults to True.
show_total_legend : bool, optional
Turns Total in legend on and off. ~ False. Defaults to True.
show_total : bool, optional
Turns total bar on and off. ~ False. Defaults to True.
show_count : bool, optional
Turns counts in variable labels on and off. ~ True. Defaults to False.
show_index : bool, optional
Turns index in variable labels on and off. ~ True. Defaults to False.
show_count_legend : bool, optional
Turns counts in legend labels on and off. ~ True. Defaults to False.
show_index_legend : bool, optional
Turns index in legend labels on and off. ~ True. Defaults to False.
show_all_bars : bool, optional
Turns bars with zero observations on and off. ~ True. Defaults to False.
show_legend_title : bool, optional
Toggles legend title. ~ True / False. Defaults to None, which means automatic.
select_variable_levels : list[int], optional
A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to None.
select_min_count : int, optional
Selects all bars with higher count than given. ~ 10 means that only bars with more than 10 observations will be selected. Defaults to 0.
break_x_axis : bool, optional
Moves the breaks to the x-axis and the left labels to the legend. This happens by default if only one variable is given. Mostly used for time-series data. Defaults to False.
special_variables : list[int], optional
A list of variable levels that will be treated differently. They will appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional
Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional
Turns figure export on and off. Defaults to True.
return_figure : bool, optional
Turns figure return on and off. Defaults to False.
df : pd.DataFrame, optional
The current data file ~ usually df. Defaults to 'auto'.
meta : pyreadstat._readstat_parser.metadata_container, optional
The current meta data file ~ usually meta. Defaults to 'auto'.

Returns

Optional[tuple]
returns df and meta objects if return_data is True
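
The trimming rules described for title_remove_before, title_remove_after, left_labels_remove_before and left_labels_remove_after can be sketched in plain Python. This is a minimal illustration, not the package's own code: the helper name trim_label and the sample label are hypothetical, and it assumes that the first occurrence of each marker is the one that counts.

```python
def trim_label(label: str, remove_before: str = '', remove_after: str = '') -> str:
    """Hypothetical helper that mirrors the documented trimming rules."""
    if remove_before and remove_before in label:
        # Drop everything up to and including the first occurrence of the marker.
        label = label.split(remove_before, 1)[1]
    if remove_after and remove_after in label:
        # Drop the marker and everything that follows it.
        label = label.split(remove_after, 1)[0]
    return label


# Invented example label, not taken from a real data file.
label = "Frage 3? Wie zufrieden sind Sie mit dem Angebot. Bitte wählen Sie eine Antwort."
trimmed = trim_label(label, remove_before="? ", remove_after=". Bitte")
print(trimmed)
```

With empty markers (the documented defaults) the label is returned unchanged.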
def create_market_change_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'market_change_break', dont_know_code: bool = False) ‑> None

Creates a standard market change break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_method_break(metatable: MetaTable2, original_variable: str = 'method_break', new_break_name: str = 'method_break', other_category_break: str = 'Andere', dont_know_code: bool = False) ‑> None

Creates a standard method break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
other_category_break : str
Name of the "Others" category
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_ownership_break(metatable: MetaTable2, original_variable: str = 'EIGENTUEMERMIETER', new_break_name: str = 'ownership_break', dont_know_code: bool = False) ‑> None

Creates a standard ownership break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_piechart(info: dict, variables: list, breaks: list = None, break_labels_rename: dict[str, str] = None, legend_labels_rename: dict[str, str] = None, legend_position: str = 'bottom', legend_title_custom_text: str = 'auto', title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, return_data: bool = False, color_theme: str = 'auto', color_type: str = 'groups', color_direction: str = 'forward', color_custom: list[str] = None, value_labels_color: int = 999, value_labels_show_min: float = 0.8, weight: str = 'auto', order_by: dict = None, title_remove_before: str = '', title_remove_after: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', value_labels_size: float = 16, legend_labels_size: float = 12, hole_size: float = 0.4, show_legend: bool = True, show_legend_title: Optional[bool] = None, show_count: bool = False, show_decimals: bool = False, select_variable_levels: list[int] = None, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: dict = None, save_figure: bool = True, return_figure: bool = False, df: pandas.core.frame.DataFrame = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto')

Creates a pie chart or donut chart, depending on the hole_size parameter

Args

info : dict
A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list
A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
breaks : list, optional
A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to None.
break_labels_rename : dict[str, str], optional
Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to None.
legend_labels_rename : dict[str, str], optional
Renames the labels of the different legend levels. ~ {'old_name': 'new_name'}. Defaults to None.
legend_position : str, optional
Sets the position of the legend. ~ 'right'. Defaults to 'bottom'.
legend_title_custom_text : str, optional
Creates a custom legend title, if title is desired. Defaults to 'auto'.
title_custom_text : str, optional
Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional
Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional
Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
title_position : int, optional
Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional
Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional
Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
return_data : bool, optional
Returns the data frame, that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional
Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_type : str, optional
Sets color value type. ~ 'diverging', 'single_hue'. Defaults to 'groups'.
color_direction : str, optional
Sets direction of color pattern. ~ 'backward'. Defaults to 'forward'.
color_custom : list[str], optional
Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
value_labels_color : int, optional
Sets threshold for black or white label color. Use 0 for all black labels and 255 for all white. Use numbers between for a mix. Defaults to 999.
value_labels_show_min : float, optional
Sets threshold showing labels by %. Use 0 to show all labels and 100 to show none. Use numbers between for a mix. Defaults to 0.8.
weight : str, optional
Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'auto'.
order_by : dict, optional
A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to None.
title_remove_before : str, optional
Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional
Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
tag_add_before : str, optional
Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional
Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional
Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional
Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
value_labels_size : float, optional
Sets the size of value labels. Defaults to 16.
legend_labels_size : float, optional
Sets the size of legend. Defaults to 12.
hole_size : float, optional
Defines the hole size in a donut chart. Defaults to 0.4.
show_legend : bool, optional
Turns legend on and off. ~ False. Defaults to True.
show_legend_title : bool, optional
Toggles legend title. ~ True / False. Defaults to None, which means automatic.
show_count : bool, optional
Turns counts in variable labels on and off. ~ True. Defaults to False.
show_decimals : bool, optional
Rounds the labels to one decimal place if set to True. Defaults to False.
select_variable_levels : list[int], optional
A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to None.
special_variables : list[int], optional
A list of variable levels that will be treated differently. They will appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional
Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional
Turns figure export on and off. Defaults to True.
return_figure : bool, optional
Turns figure return on and off. Defaults to False.
df : pd.DataFrame, optional
The current data file ~ usually df. Defaults to 'auto'.
meta : pyreadstat._readstat_parser.metadata_container, optional
The current meta data file ~ usually meta. Defaults to 'auto'.

Returns

Optional[tuple]
returns df and meta objects if return_data is True
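
The effect of select_variable_levels, which both selects and orders levels, can be pictured with a small sketch. It is an assumption about the behaviour described above rather than the package's implementation: the function name select_levels and the shares are invented for illustration, and levels missing from the data are simply skipped here.

```python
from typing import Optional


def select_levels(shares: dict, select_variable_levels: Optional[list] = None) -> dict:
    """Hypothetical helper: keep only the requested levels, in the requested order."""
    if not select_variable_levels:
        # No selection given: keep every level in its original order.
        return dict(shares)
    return {level: shares[level] for level in select_variable_levels if level in shares}


# Invented percentage shares per variable level.
shares = {1: 40.0, 2: 35.0, 3: 20.0, 99999997: 5.0}
selected = select_levels(shares, [2, 1, 99999997])
print(selected)
```

Level 3 is dropped and the remaining levels follow the order of the list passed in.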
def create_role_break(metatable: MetaTable2, original_variables: str, new_break_name: str = 'role_break', dont_know_code: bool = False) ‑> None

Creates a standard role break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variables : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_role_year_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'create_role_year_break', dont_know_code: bool = False) ‑> None

Creates a standard role year break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_sex_break(metatable: MetaTable2, original_variable: str = 'S11', new_break_name: str = 'sex_break', use_non_binary: bool = True, dont_know_code: bool = False) ‑> None

Creates a standard sex break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
use_non_binary : bool
Include "diverse" sex if True, only use binary sex otherwise
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_siedlungsart_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a siedlungsart break based on a JSON mapping file.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
original_variable_type : str
Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_sprachregion_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a sprachregion break based on a JSON mapping file.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
original_variable_type : str
Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_stacked_barchart(info: dict, variables: list, breaks: list = None, break_labels_rename: dict[str, str] = None, left_labels_rename: dict[str, str] = None, legend_labels_rename: dict[str, str] = None, left_labels_wrap: int = 50, left_labels_width: int = 999, show_mean: bool = False, title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, return_data: bool = False, color_theme: str = 'auto', color_type: str = 'diverging', color_direction: str = 'forward', color_custom: list[str] = None, value_labels_color: int = 999, value_labels_show_min: float = 0.8, weight: str = 'auto', order_by: dict = None, title_remove_before: str = '', title_remove_after: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', left_labels_remove_before: str = '', left_labels_remove_after: str = '', value_labels_add_after: str = '', bar_width: float = 999, break_distance: float = 999, label_size: float = 16, value_labels_size: float = 16, legend_labels_size: float = 14, show_legend: bool = True, show_total: bool = True, show_count: bool = False, show_index: bool = False, mean_name: str = 'MW', mean_transform: function = <function <lambda>>, mean_custom: list[str] = None, select_variable_levels: list[int] = None, select_variables: list[int] = None, select_min_count: int = 0, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: dict = None, save_figure: bool = True, return_figure: bool = False, df: pandas.core.frame.DataFrame = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto')

Creates a stacked barchart

Args

info : dict
A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
df : pd.DataFrame
The current data file ~ usually df
meta : pyreadstat._readstat_parser.metadata_container
The current meta data file ~ usually meta
variables : list[str]
A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
breaks : list[str], optional
A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to None.
break_labels_rename : dict[str, str], optional
Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_rename : dict[str, str], optional
Renames the labels of the different break levels. ~ {'old_name': 'new_name'}. Defaults to None.
legend_labels_rename : dict[str, str], optional
Renames the labels of the variable in legend. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_wrap : int, optional
Wraps the variable labels. Defaults to 50.
left_labels_width : int, optional
Changes the space allocated to the y-label. Defaults to 999 (which means automatic).
show_mean : bool, optional
Turns the mean annotations on. ~ True. Defaults to False.
title_custom_text : str, optional
Creates custom title text, if automatic text isn't desired. Defaults to "auto".
subtitle_custom_text : str, optional
Creates custom subtitle text, if automatic text isn't desired. Defaults to "auto".
tag_custom_text : str, optional
Creates custom tag text, if automatic text isn't desired. Defaults to "auto".
title_position : int, optional
Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional
Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional
Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
return_data : bool, optional
Returns the data frame, that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional
Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'standard'.
color_type : str, optional
Sets color value type. ~ 'groups', 'single_hue'. Defaults to 'diverging'.
color_direction : str, optional
Sets direction of color pattern. ~ 'backward'. Defaults to 'forward'.
value_labels_color : int, optional
Sets threshold for black or white label color. Use 0 for all black labels and 255 for all white. Use numbers between for a mix. Defaults to 130.
value_labels_show_min : float, optional
Sets threshold showing labels by %. Use 0 to show all labels and 100 to show none. Use numbers between for a mix. Defaults to 0.8.
weight : str, optional
Turns weight on if weight variable name is added. ~ "gewicht". Defaults to ''.
order_by : dict, optional
A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to None.
title_remove_before : str, optional
Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional
Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
tag_add_before : str, optional
Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional
Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional
Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional
Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
left_labels_remove_before : str, optional
Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
left_labels_remove_after : str, optional
Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
value_labels_add_after : str, optional
Adds text to value labels. Is useful to add a % sign for example. Defaults to ''.
bar_width : float, optional
Sets the width of the bars. Defaults to 0.8.
break_distance : float, optional
Sets the distance between the breaks. Defaults to 0.5.
label_size : float, optional
Sets the size of labels. Defaults to 16.
value_labels_size : float, optional
Sets the size of value labels. Defaults to 16.
legend_labels_size : float, optional
Sets the size of legend. Defaults to 14.
show_legend : bool, optional
Turns legend on and off. ~ False. Defaults to True.
show_total : bool, optional
Turns total bar on and off. ~ False. Defaults to True.
show_count : bool, optional
Turns counts in variable labels on and off. ~ True. Defaults to False.
show_index : bool, optional
Turns index in variable labels on and off. ~ True. Defaults to False.
mean_name : str, optional
Changes name written above the mean values. Defaults to 'MW'.
mean_transform : LambdaType, optional
Transforms the mean values before they are shown. For example, lambda n: -n + 2 negates n and adds 2. Defaults to lambda n: n.
select_variable_levels : list[int], optional
A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to None.
special_variables : list[int], optional
A list of variable levels that will be treated differently. They will appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional
Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional
Turns figure export on and off. ~ False. Defaults to True.
return_figure : bool, optional
Turns figure return on and off. ~ True. Defaults to False.

Returns

_type_
If return_data is on it returns the plot data, so it can be viewed.
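
How special_variables and mean_transform interact with the mean can be illustrated with a short sketch. It assumes an unweighted mean over raw answer codes, which may differ from what the package actually computes (for example when a weight variable is set). The function name mean_without_special and the answer codes are hypothetical.

```python
SPECIAL_VARIABLES = [96, 97, 98, 99999996, 99999997, 99999998]


def mean_without_special(values, special_variables=SPECIAL_VARIABLES,
                         mean_transform=lambda n: n):
    """Hypothetical helper: unweighted mean that ignores special codes."""
    # Special codes (e.g. "don't know") are excluded from the calculation.
    valid = [v for v in values if v not in special_variables]
    if not valid:
        return None
    # The transformation is applied to the finished mean, not to single answers.
    return mean_transform(sum(valid) / len(valid))


# Invented answer codes; 98 and 99999997 are treated as special.
answers = [1, 2, 2, 3, 98, 99999997]
plain = mean_without_special(answers)
flipped = mean_without_special(answers, mean_transform=lambda n: -n + 2)
print(plain, flipped)
```

The second call uses the transformation from the mean_transform description, lambda n: -n + 2.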
def create_total_column(metatable: MetaTable2, name: str = 'tz') ‑> None

Creates the total column used in many standard tasks.

Args

metatable : MetaTable2
The Metatable that needs editing
name : str
Name to use for the total column. Defaults to 'tz'.
def create_winner_loser_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'winner_loser_break', dont_know_code: bool = False) ‑> None

Creates a standard winner/loser break.

Args

metatable : MetaTable2
The Metatable that needs editing
original_variable : str
Name of the original variable
new_break_name : str
Name of the newly created break
dont_know_code : bool
Whether the "don't know" code should be recoded
def create_wordcloud(info: dict, variables: list[str], language_column: str, translate_to: int = 564, transform: str = 'no', font_name: str = 'arial', title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, return_data: bool = False, color_theme: str = 'auto', color_custom: str = 'auto', title_remove_before: str = '', title_remove_after: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', remove_words: Optional[None] = None, min_font_size: int = 0, min_count: int = 1, max_words: int = 200, translation_file_path: str = None, standard_arguments: dict = None, save_figure: bool = True, df: pandas.core.frame.DataFrame = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto') ‑> Optional[tuple]

Generates a word cloud plot and translates open answers into the target language

Args

info : dict
A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list
A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
language_column : str
The column that contains the language information.
translate_to : int, optional
Language code to translate to; matches the Nebu language code. Defaults to 564.
transform : str, optional
How to transform the words. ~ 'upper', 'lower', 'no'. Defaults to 'no'.
font_name : str, optional
Font to use. Defaults to 'arial'.
title_custom_text : str, optional
Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional
Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional
Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
title_position : int, optional
Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional
Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional
Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
return_data : bool, optional
Returns the data frame, that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional
Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_custom : str, optional
Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
title_remove_before : str, optional
Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional
Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
tag_add_before : str, optional
Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional
Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional
Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional
Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
remove_words : Optional[list[str]], optional
Words to remove in the wordcloud. Defaults to None.
min_font_size : int, optional
Minimum font size to be displayed. Defaults to 0.
min_count : int, optional
Minimum word frequency to be displayed. Defaults to 1.
max_words : int, optional
Maximum number of words to be displayed. Defaults to 200.
translation_file_path : str, optional
Path to a file with already translated texts. Defaults to None.
standard_arguments : dict, optional
Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional
Turns figure export on and off. Defaults to True.
df : pd.DataFrame, optional
The current data file ~ usually df. Defaults to 'auto'.
meta : pyreadstat._readstat_parser.metadata_container, optional
The current meta data file ~ usually meta. Defaults to 'auto'.

Returns

Optional[tuple]
description
def get_answer_list(df, column: str) ‑> list

Get a list of answers from a column in a data frame.

Args

df : pd.DataFrame
The data frame.
column : str
The column name.

Returns

list
A list of answers.
def get_word_frequencies(german_answers: list[str], transform: str, remove_words: Optional[None] = None, min_count: int = 1) ‑> dict

Get word frequencies from a list of German answers.

Args

german_answers : list
A list of German answers.
transform : str
How to transform the words. ~ 'upper', 'lower', 'no'
remove_words : list, optional
A list of words to remove. Defaults to None.
min_count : int, optional
The minimum count of a word to be included. Defaults to 1.
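
The parameters above suggest a counting step roughly like the following sketch. It is not the package's implementation: the name word_frequencies_sketch, the whitespace tokenisation and the punctuation stripping are assumptions, and remove_words is compared against the already transformed word.

```python
import string
from collections import Counter
from typing import Optional


def word_frequencies_sketch(answers: list, transform: str = 'no',
                            remove_words: Optional[list] = None,
                            min_count: int = 1) -> dict:
    """Hypothetical re-implementation of the documented parameters."""
    removed = set(remove_words or [])
    counts = Counter()
    for answer in answers:
        for raw in answer.split():
            # Strip surrounding punctuation such as commas and full stops.
            word = raw.strip(string.punctuation)
            if transform == 'upper':
                word = word.upper()
            elif transform == 'lower':
                word = word.lower()
            if word and word not in removed:
                counts[word] += 1
    # Keep only words that occur at least min_count times.
    return {word: count for word, count in counts.items() if count >= min_count}


# Invented open answers.
answers = ["Gute Qualität, guter Preis", "Preis und Qualität", "gute Beratung"]
freqs = word_frequencies_sketch(answers, transform='lower',
                                remove_words=['und'], min_count=2)
print(freqs)
```

A dictionary of this shape (word to count) is the kind of input a word cloud renderer typically expects.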
def load_and_render_panel_email(invitation_reason: str, end_of_study: str, study_code: str, length_min: float, points: Union[float, int] = 'auto', template_html: str = 'Vorlage_Mail_EinladungUmfrage_dt.html', output_file_path: str = 'Vorlage_Mail_EinladungUmfrage_dt.html', input_file_path: str = './templates/polittrends_invitation_templates/')

Creates a panel invitation email by adding the invitation reason, the study number and the length of the study to the panel template. Points and CHF values are calculated from the length of the study. Other languages can be chosen by using another template.

Args

invitation_reason : str
Reason for Invitation
end_of_study : str
End of Study, as a string but should be a date (eg. 16. November 2023)
study_code : str
Code from Manageframes (eg: 999923528)
length_min : float
length of the study in minutes
points : Union[float, int]
Points for the study. If 'auto', the points are calculated from length_min.
template_html : str
template file name
output_file_path : str
file name to be saved
input_file_path : str
folder for templates
def print_word_frequencies(df, column, min_count)

Prints word frequencies for a given column.

Args

df : pd.DataFrame
The data frame.
column : str
The column name.
min_count : int
The minimum count of a word to be included.
def save_presentation(info: dict) ‑> None

Saves the presentation

Args

info : dict
The presentation object
def show_stats(meta, variable)
def show_summary(meta, columns: list = [])
def start_presentation(name: str, df: pandas.core.frame.DataFrame, meta: pyreadstat._readstat_parser.metadata_container, output_path: str = './output/', template_path: str = './templates/', template_custom: str = 'auto', color_theme: str = 'standard', language: str = 'de', value_labels_color: int = 130, weight: str = '', line_width_linechart: int = 4, line_marker_size_linechart: int = 8, bar_width_barchart: float = 0.7, bar_width_stacked_barchart: float = 0.8, bar_gap_barchart: float = 0.15, break_distance: float = 0.5, count_name: str = 'N', logo_side: str = 'right', project_leader: str = None, project_people: list[str] = None, export_svg: bool = False, standard_arguments: dict = None, show_page_number: bool = False, use_wide_screen: bool = False) ‑> dict

Creates a new presentation

Args

name : str
Changes the name of the presentation
df : pd.DataFrame
The current data file ~ usually df
meta : pyreadstat._readstat_parser.metadata_container
The current meta data file ~ usually meta
output_path : str, optional
Changes path and name of the output files. Defaults to "./output/".
template_path : str, optional
Sets path of pptx-template file. Defaults to './templates/'.
template_custom : str, optional
Choose a custom template. Defaults to 'auto'.
color_theme : str, optional
Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'standard'.
language : str, optional
Sets language for texts like subtitle and tag. ~ 'fr', 'en', 'it'. Defaults to 'de'.
value_labels_color : int, optional
Sets threshold for black or white label color. Use 0 for all black labels and 255 for all white. Use numbers between for a mix. Defaults to 130.
weight : str, optional
Turns weight on if weight variable name is added. ~ "gewicht". Defaults to ''.
bar_width_barchart : float, optional
Sets the width of the barchart bars. Defaults to 0.7.
bar_width_stacked_barchart : float, optional
Sets the width of the stacked barchart bars. Defaults to 0.8.
bar_gap_barchart : float, optional
Sets the gap between the bars within a group. Defaults to 0.15.
break_distance : float, optional
Sets the distance between the breaks. Defaults to 0.5.
count_name : str, optional
Changes the name of the count. ~ if "K" is passed in, counts look like "K = 42". Defaults to 'N'.
logo_side : str, optional
Changes the position of the logo on the presentation slides. ~ 'left'. Defaults to 'right'.
project_leader : str
The first name of the project leader. Can be "Andreas" for example. Defaults to None.
project_people : list[str], optional
A list of the names of the people working on the project. For example ["Andreas", "Nadia"]. Defaults to None.
export_svg : bool, optional
Exports graphs as .svg files if True. Defaults to False.
standard_arguments : dict, optional
Adds some standard arguments that normally don't need to be changed. Defaults to None.
show_page_number : bool, optional
Turns page number on PowerPoint on and off ~ True. Defaults to False.
use_wide_screen : bool, optional
Turns slides into 16:9 widescreen format ~ True. Defaults to False.

Returns

dict
Returns a dictionary with presentation info, the presentation object and some info about the plot theme.
def translate_column(info: dict, open_question_column: str, language_column: str, translate_to: int, df: pandas.core.frame.DataFrame, translation_file_path: str) ‑> None

Translates a column to a different language. Saves the translations to a JSON-file.

Args

info : dict
A dictionary with info about the presentation.
open_question_column : str
The column that should be translated.
language_column : str
The column that contains the language information.
translate_to : int
The language code to translate to.
df : pd.DataFrame
The data frame that contains the columns.
translation_file_path : str
file path to take the saved translations from.

Classes

class MatrixMixer (meta_table: MetaTable2, breaks: list = [['tz']], language: str = 'de', weight: str = 'tz', special_variables_last: bool = True, study_title: str = 'Studie', significance_level: float = 0.05, use_conf_interval: bool = False, conf_interval: float = 0.95, column_width: int = 15, hide_total_break: bool = False)

Assembles multiple tables with descriptive statistics and exports them to excel.

Args

meta_table : MetaTable2
MetaTable created using gfs_functions
breaks : list[list[str]]
Selects the breaks used by default. Defaults to [['tz']].
language : str
Selects the language of the Matrix. Defaults to 'de'.
weight : str
Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'tz'.
special_variables_last : bool
If True, special variables will be shown at the bottom. Defaults to True.
study_title : str
Title of the study. Defaults to 'Studie'.
use_conf_interval : bool
Show confidence intervals. Defaults to False.
conf_interval : float
Sets the confidence level of the confidence interval. Defaults to 0.95.
column_width : int
Column width in Excel. Defaults to 15.
hide_total_break : bool
hides the "C" column, which is normally the "Total" break. Defaults to False.
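
To make the conf_interval argument concrete, here is a minimal sketch of a confidence interval for a share. It assumes a normal-approximation interval, which may differ from what MatrixMixer actually computes; the function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(share: float, n: int,
                        conf_interval: float = 0.95) -> tuple[float, float]:
    """Normal-approximation interval for a share (assumption, not the package's method)."""
    # two-sided critical value, about 1.96 for conf_interval = 0.95
    z = NormalDist().inv_cdf(0.5 + conf_interval / 2)
    margin = z * sqrt(share * (1 - share) / n)
    # clip to the valid range of a share
    return max(0.0, share - margin), min(1.0, share + margin)


low, high = proportion_interval(share=0.4, n=1000, conf_interval=0.95)
print(f"{low:.3f} to {high:.3f}")
```

A higher conf_interval widens the interval, a larger sample narrows it.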

Method generated by attrs for class MatrixMixer.

Class variables

var breaks : list
var column_width : int
var conf_interval : float
var hide_total_break : bool
var language : str
var meta_table : MetaTable2
var significance_level : float
var special_variables_last : bool
var study_title : str
var use_conf_interval : bool
var weight : str

Methods

def add_title_page_info(self, study_title: str, clients: list[str], survey_methodology: str, sampling: str, quota_features: str, address_origin: list[str], population: str, weighting: str, survey_period: str, random_sample: str = None)

sets title page info

Args

study_title : str
title of study
clients : list[str]
list of clients
survey_methodology : str
method used in study
sampling : str
sampling technique used
quota_features : str
features the quotas were set on
address_origin : list[str]
origin of addresses
population : str
population of study
weighting : str
weighting variables or factors
survey_period : str
period in which the survey took place
random_sample : str
used to overwrite the random sample string, default is n = (number of recipients)
def create_matrix(self, question: str, breaks: list[list[str]] = 'auto', weight: str = 'auto', order_descending: Union[str, bool] = 'auto', special_variables: Union[str, list[int]] = 'auto', nps: bool = False, groupings: dict = None) ‑> None

Creates a single matrix and saves it to the MatrixMixer object.

Args

question : str
Name of question. Can either be of type Column or Group.
breaks : list[list[str]], optional
Selects the breaks used in matrix. Defaults to 'auto'.
weight : str, optional
Selects the weight used in matrix. Defaults to 'auto'.
order_descending : Union[str, bool], optional
Orders the values descending or not. Use either False or True. Defaults to 'auto'.
special_variables : Union[str, list[int]], optional
List of special variables which will appear at the bottom. Could be something like [96, 97, 98, 99999997, 99999998]. Defaults to 'auto'.
nps : bool, optional
Adds the NPS to the matrix. Defaults to False.
groupings : dict, optional
Dictionary of groupings used to create boxes of combined codes. An example is {"Top-Box (ja, eher ja)":[1, 2]}. Defaults to None.
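
As a rough illustration of what a groupings dictionary expresses, the sketch below computes the unweighted share of answers that fall into each box. The helper `box_shares` and the sample codes are made up for this example; the real create_matrix works on the MetaTable2 data and may apply weights.

```python
def box_shares(codes: list[int], groupings: dict[str, list[int]]) -> dict[str, float]:
    """Share of answers per box in percent (unweighted, illustrative only)."""
    total = len(codes)
    return {
        label: 100 * sum(code in box for code in codes) / total
        for label, box in groupings.items()
    }


answers = [1, 2, 2, 3, 4, 1, 2, 4]  # made-up codes of one question
shares = box_shares(answers, {"Top-Box (ja, eher ja)": [1, 2]})
print(shares)  # {'Top-Box (ja, eher ja)': 62.5}
```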
def export_excel(self, show_title_page: bool = True, show_combined_sheet: bool = True, show_percentage_sheet: bool = True, show_absolute_sheet: bool = False, show_debugging_sheet: bool = False, table_name: str = None) ‑> None

Exports the matrices from the MatrixMixer to Excel.

Args

show_title_page : bool, optional
show a title page. Defaults to True.
show_combined_sheet : bool, optional
shows sheet with combined values. Defaults to True.
show_percentage_sheet : bool, optional
shows sheet with percentage values. Defaults to True.
show_absolute_sheet : bool, optional
shows sheet with absolute values. Defaults to False.
show_debugging_sheet : bool, optional
shows all sheets for debugging purposes. Defaults to False.
table_name : str, optional
Name of the exported table file. Defaults to None.
class MetaTable (df: pandas.core.frame.DataFrame, meta: pyreadstat._readstat_parser.metadata_container, project_name: str, year: str = '2023')

Creates a MetaTable object that makes data wrangling of files read with pyreadstat easy.

Args

df : pd.DataFrame
DataFrame read with pyreadstat
meta : pyreadstat._readstat_parser.metadata_container
pyreadstat metadata
project_name : str
The name of the project. Is used as a folder name for the generated files
gfs_meta : dict
gfs specific meta data that is used in other gfs products

Methods

def create_column_copy(self, new_column: str, column: str, copy_values: bool)

Creates a new column that's a copy of another column in the MetaTable

Args

new_column : str
Name of the new column
column : str
Name of the column to be copied
copy_values : bool
If True it copies the dataframe values, if False, the column will be full of np.NaN
def create_empty_columns(self, columns: list[str], label: str, variable_labels: dict[int, str])
def export_config(self)

Exports an excel-file that makes changing the meta data very simple.

def import_config(self)

Imports the excel-file with the changed meta data and changes the MetaTable accordingly.

def recode(self, columns: Union[list[str], dict[str, str]], values: dict[int, typing.Any], keep_untouched_codes=True)

recodes chosen columns

Args

columns : Union[list[str], dict[str, str]]
Can either be a list of the columns that should get new codes, like ['F5_01', 'F5_02'], or a dictionary with the columns that should be kept unchanged as keys and the new columns as values, like {'F5_01': 'F5_01_rec', 'F5_02': 'F5_02_rec'}.
values : dict[int, Any]
A dictionary with the new codes as keys and the old codes that should be recoded as values, like {1: [1, 2], 2: range(3, 6), 3: 6}. This means that the old codes 1 and 2 become the new code 1, the codes 3, 4 and 5 become the new code 2, and the code 6 becomes the new code 3.
keep_untouched_codes : bool, optional
For example: when a code 7 exists but is not changed by the "values" argument, it is kept as it was if True. If False, all answers with code 7 will be set to Null and the label for the code 7 will be deleted. Defaults to True.
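
The values mapping and the keep_untouched_codes flag can be sketched in plain Python as follows. This only mirrors the behaviour described above on a list of codes; the helpers `build_lookup` and `recode_codes` are hypothetical and do not touch labels or DataFrames.

```python
from collections.abc import Iterable
from typing import Any, Optional


def build_lookup(values: dict[int, Any]) -> dict[int, int]:
    """Invert {new_code: old_codes} into {old_code: new_code}."""
    lookup: dict[int, int] = {}
    for new_code, old_codes in values.items():
        if not isinstance(old_codes, Iterable):
            old_codes = [old_codes]  # a single old code such as 6
        for old_code in old_codes:
            lookup[old_code] = new_code
    return lookup


def recode_codes(codes: list[int], values: dict[int, Any],
                 keep_untouched_codes: bool = True) -> list[Optional[int]]:
    lookup = build_lookup(values)
    recoded: list[Optional[int]] = []
    for code in codes:
        if code in lookup:
            recoded.append(lookup[code])
        else:
            # untouched codes are kept, or set to None as a stand-in for Null
            recoded.append(code if keep_untouched_codes else None)
    return recoded


values = {1: [1, 2], 2: range(3, 6), 3: 6}
print(recode_codes([1, 2, 3, 5, 6, 7], values))
# [1, 1, 2, 2, 3, 7]
print(recode_codes([1, 2, 3, 5, 6, 7], values, keep_untouched_codes=False))
# [1, 1, 2, 2, 3, None]
```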
def rename_columns(self, columns: dict[str, str])

Renames the given columns.

Args

columns : dict[str, str]
Columns to rename: {'old_name': 'new_name'}
def return_components(self) ‑> tuple[pandas.core.frame.DataFrame, pyreadstat._readstat_parser.metadata_container, dict]

Returns the updated DataFrame, the metadata that are contained within the object and the gfs metadata

Returns

tuple[pd.DataFrame, pyreadstat._readstat_parser.metadata_container, dict]
returns tuple with objects
def scale_level(self, columns: list[str], new_scale: str)

Changes the scale levels of the given columns.

Args

columns : list[str]
A list of columns
new_scale : str
The new scale of the given columns. Can be 'nominal', 'scale' or 'ordinal'
def select_columns(self, columns: list)

selects the given columns and removes all others.

Args

columns : list
Columns to select
def show(self, columns: Union[list[str], str] = 'last changed', only_label: bool = False, show_objects: bool = False, total: bool = False)

Shows info about the value labels and variable label of the given columns.

Args

columns : Union[list[str], str]
Columns to be shown. If no column is passed in the last changed columns will be used. columns can either be a list of column names or a single column name as a string.
only_label : bool
Only shows the variable label
show_objects : bool
Also shows the python objects (makes copying easier)
total : bool
Makes some changes to the output for the show_all function. Not useful to the end user.
def show_all(self, columns: Union[list[str], str] = 'last changed')

Shows info about the value labels and the meta data of the given columns.

Args

columns : Union[list[str], str]
Columns to be shown. If no column is passed in the last changed columns will be used. columns can either be a list of column names or a single column name as a string.
def show_meta(self, columns: Union[list[str], str] = 'last changed', total: bool = False)

Shows info about the meta data of the given columns.

Args

columns : Union[list[str], str]
Columns to be shown. If no column is passed in the last changed columns will be used. columns can either be a list of column names or a single column name as a string.
total : bool
Makes some changes to the output for the show_all function. Not useful to the end user.
def val_lab(self, columns: list[str], labels: Union[dict[int, str], str], keep_untouched_codes=False)

Changes the value labels of the given columns

Args

columns : list[str]
A list of columns that need new value labels
labels : Union[dict[int, str], str]
A dictionary with new labels {1 : "label for code 1", 2: "label for code 2"} or the variable name with the labels to be used "variable_name"
keep_untouched_codes : bool, optional
This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to False.
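
The effect of keep_untouched_codes on value labels can be pictured as a dictionary merge. The sketch below is an assumption about the described behaviour, not the method's code; `update_labels` and the sample labels are invented for illustration.

```python
def update_labels(old: dict[int, str], new: dict[int, str],
                  keep_untouched_codes: bool = False) -> dict[int, str]:
    """Replace all labels, or merge the new labels into the old ones."""
    if keep_untouched_codes:
        return {**old, **new}  # old labels survive unless a new one overrides them
    return dict(new)  # only the new labels remain


old_labels = {1: "ja", 2: "nein", 98: "weiss nicht"}
new_labels = {1: "Ja", 2: "Nein"}

print(update_labels(old_labels, new_labels))
# {1: 'Ja', 2: 'Nein'}
print(update_labels(old_labels, new_labels, keep_untouched_codes=True))
# {1: 'Ja', 2: 'Nein', 98: 'weiss nicht'}
```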
def var_lab(self, columns: list[str], text: str)

Changes the variable labels of the given columns

Args

columns : list[str]
A list of columns that need a new variable label
text : str
Text of the variable label
def write_sav(self, path: str)

Writes a sav file of the MetaTable object which includes a Dataframe and the metadata.

Args

path : str
Path of the new file, includes the file name. To save the file in the package directory use "./filename.sav"
class MetaTable2 (df: pandas.core.frame.DataFrame, meta: pyreadstat._readstat_parser.metadata_container, project_number: int, project_name: str, columns: dict[str, Column] = _Nothing.NOTHING, groups: dict[str, Group] = _Nothing.NOTHING, weights: dict[str, Weight] = _Nothing.NOTHING, year: str = '2025', output_path: str = './output/', projects_path: str = './gfs_projects/')

Creates MetaTable object that makes data wrangling of files read with pyreadstat easy.

Args

df : pd.DataFrame
DataFrame read with pyreadstat
meta : pyreadstat._readstat_parser.metadata_container
pyreadstat metadata
project_number : int
The project number. Is used as a folder name for the generated files
project_name : str
The name of the project. Is used as a folder name for the generated files
columns : dict[str, Column]
metadata of all dataframe columns. Is part of gfs-meta
groups : dict[str, Group]
metadata of all column groups. Is part of gfs-meta
year : str
current year. Is used to choose the save folder
output_path : str, optional
Path of the output. Defaults to './output/'.
projects_path : str, optional
Path of the git projects folder. Defaults to './gfs_projects/'.

Method generated by attrs for class MetaTable2.

Class variables

var columns : dict[str, Column]
var df : pandas.core.frame.DataFrame
var groups : dict[str, Group]
var meta : pyreadstat._readstat_parser.metadata_container
var output_path : str
var project_name : str
var project_number : int
var projects_path : str
var weights : dict[str, Weight]
var year : str

Methods

def add_column_to_group(self, column: str, group: str)

Adds column to a group.

Args

column : str
Name of the column
group : str
Group to add the column
def add_to_group(self, column: str)

Adds info to the group that the column belongs to it.

Args

column : str
Name of the column
def calculate_weight(self, column: str, target_values: dict, return_df: bool = False)

Calculates the weights for a given weight column based on the target values. You need to create the weight before you can calculate it.

Args

column : str
Name of the weight column
target_values : dict
Dictionary with the target values for each intersection. This is a nested dictionary and it accepts absolute values. Can be like {1: {1: 20, 2: 25}, 2: {1: 40, 2: 50}}
return_df : bool
If True it will return the weighted dataframe. Defaults to False.
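
One way to read the nested target_values dictionary is as a target count per intersection of two weight columns. The sketch below assumes simple cell weighting (target count divided by observed count); the actual calculate_weight may use a different procedure or normalise the result. `cell_weights` and the sample rows are hypothetical.

```python
from collections import Counter


def cell_weights(rows: list[tuple[int, int]],
                 target_values: dict[int, dict[int, float]]) -> list[float]:
    """Weight per respondent = target count of the cell / observed count of the cell."""
    observed = Counter(rows)
    return [target_values[first][second] / observed[(first, second)]
            for first, second in rows]


# each tuple holds the codes of the two weight columns for one respondent
rows = [(1, 1), (1, 1), (1, 2), (2, 1), (2, 2), (2, 2)]
targets = {1: {1: 20, 2: 25}, 2: {1: 40, 2: 50}}

weights = cell_weights(rows, targets)
print(weights)       # [10.0, 10.0, 25.0, 40.0, 25.0, 25.0]
print(sum(weights))  # 135.0, the sum of all target values
```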
def check_duplicates(self, row: np.array) ‑> bool

function to check for duplicates in a row

Args

row : np.array
row to apply the function to

Returns

bool
if duplicates exist returns True, else False
def check_missing_columns(self, expected_columns: list)
def check_sav_prebreak(self, method: str = 'CATI', Interviewer_Column: str = 'ENQ2', Interview_Duration_Column: str = 'DURINT', Date_Column: str = 'DATE') ‑> None

checks and prints multiple key features of a sav file from nebu

Args

method : str, optional
String to indicate the method. Possible values are "CATI" or "OTHER". Defaults to "CATI".
Interviewer_Column : str, optional
Interviewer Code Column. Defaults to "ENQ2".
Interview_Duration_Column : str, optional
Interview Duration Column. Defaults to "DURINT".
Date_Column : str, optional
Interview Date Column. Defaults to "DATE".
def copy_column(self, old_column: str, new_column: str, same_group: bool = True, add_to_group: bool = True)

Creates a copy of a column and gives it a new name.

Args

old_column : str
Name of the column to be copied
new_column : str
Name of the new column
same_group : bool, optional
If True the column will be added to the same group as the copied column. Defaults to True.
def copy_group(self, old_group: str, new_group: str)

Creates a copy of a column group and gives it a new name.

Args

old_group : str
Name of the group to be copied
new_group : str
Name of the new group
def create_column(self, column: str, label: str, value_labels: Union[dict[int, str], str, ForwardRef(None)] = None, measure: str = 'scale')

Creates a new column in the MetaTable and the DataFrame.

Args

column : str
Name of the column
label : str
Label of the column
value_labels : Union[dict[int, str], str], optional
Value labels of the column. Defaults to None.
measure : str, optional
Measure of the column. Defaults to 'scale'.
def create_group(self, group_name: str, columns: list[str], kind: str, measure: str = 'auto', lfm: str = 'yes', mean: str = 'auto', group_label: str = 'auto', group_value_labels: Union[dict[int, str], str] = 'auto', missing_values: Union[list[float], str] = 'auto')

Creates a new group in the MetaTable.

Args

group_name : str
Name of the group
columns : list[str]
List of columns that belong to the group
kind : str
Kind of the group. Can be 'multi' or 'batch'
measure : str, optional
Measure of the group. Can be 'string', 'nominal', 'scale' or 'ordinal'. Defaults to 'auto'.
lfm : str, optional
Decides if the value labels of the group should be used for all columns in the group. Can be 'yes' or 'no'. Defaults to 'yes'.
mean : str, optional
Decides if there is a useful mean value for a group of columns. Can be 'yes' or 'no'. Defaults to 'auto'.
group_label : str, optional
Label of the group. Defaults to 'auto'.
group_value_labels : Union[dict[int, str], str], optional
Value labels of the group. Defaults to 'auto'.
missing_values : Union[list[float], str], optional
List of missing values of the group. Defaults to 'auto'.
def create_weight(self, name: str, columns: list[str])

Creates a new weight based on the given columns.

Args

name : str
Name of the new weight
columns : list[str]
List of columns that should be used to calculate the weight.
def delete_column(self, column: str)

Deletes a column and the information about it from its group.

Args

column : str
Name of the column
def delete_group(self, group: str)

Deletes a column group and the information about it in every column.

Args

group : str
Name of the group
def encode(self, old_column: str, new_column: str, values: Optional[dict[str, int]] = None)

Encodes a column based on a dictionary with the new values.

Args

old_column : str
Name of the column
new_column : str
Name of the new column
values : Optional[dict[str, int]]
A dictionary that maps the old values to the new codes. Can be {"yes": 1, "no": 2}. This will replace every "yes" with 1 and every "no" with 2.
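
The mapping can be sketched on a plain list of answers. What the method does when values is None is not documented here, so the fallback below (numbering distinct answers in order of first appearance) is purely an assumption, and `encode_values` is a hypothetical helper.

```python
from typing import Optional


def encode_values(answers: list[str],
                  values: Optional[dict[str, int]] = None) -> list[int]:
    """Replace string answers with integer codes."""
    if values is None:
        # Assumption: number the distinct answers from 1 in order of first appearance.
        values = {answer: code
                  for code, answer in enumerate(dict.fromkeys(answers), start=1)}
    return [values[answer] for answer in answers]


print(encode_values(["yes", "no", "yes"], {"yes": 1, "no": 2}))  # [1, 2, 1]
```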
def export_coding_excel(self, column_lists: list[list[str]], filename: str = 'toCode', darker_columns: list = None, use_value_labels: bool = False) ‑> None

Exports a .xlsx-file with the given columns and their value labels. This is mostly used for coding open questions.

Args

column_lists : list[list[str]]
List of lists with the columns that should be exported. Can be [['CODERESP'], ['F1@', 'F1_01', 'F1_02', 'F1_03']]. All columns of every sublist will have the same background color.
filename : str
File name of the .xlsx file. Defaults to 'toCode'.
darker_columns : list
List of columns that should have a darker background color. Defaults to None.
use_value_labels : bool
If True it will display the value labels instead of the codes. Defaults to False.
def export_config(self, export_df: bool = True, gfs_config_name: str = 'gfs-config')

Exports an excel-file that makes changing the meta data very simple.

Args

export_df : bool
If True it will also export the data. Defaults to True.
def export_data(self, file_name: str = 'fertig') ‑> None

Exports a .SAV-file and a gfs-meta JSON-file.

Args

file_name : str
File name of the .sav file
def filter_label(self, column: str, filter_label: str)

Updates the filter_label of a column.

Args

column : str
Name of the column
filter_label : str
New filter_label of the column. This label adds information about the filter that was used for that question in the questionnaire.
def get_intersection_counts(self, categorical_columns: list[str])

Get the count for all combinations of the given columns.

Args

categorical_columns : list[str]
List of columns that should be used to calculate the intersection counts.
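
Counting all combinations of several categorical columns can be sketched with a Counter over tuples. The records, the column names and the helper `intersection_counts` are invented for this example; the method itself works on the MetaTable2 DataFrame and its return format may differ.

```python
from collections import Counter


def intersection_counts(records: list[dict[str, int]],
                        categorical_columns: list[str]) -> dict[tuple[int, ...], int]:
    """Count how often each combination of codes occurs."""
    combos = Counter(
        tuple(record[column] for column in categorical_columns)
        for record in records
    )
    return dict(combos)


records = [
    {"sex": 1, "age_group": 1},
    {"sex": 1, "age_group": 2},
    {"sex": 2, "age_group": 1},
    {"sex": 1, "age_group": 1},
]
counts = intersection_counts(records, ["sex", "age_group"])
print(counts)  # {(1, 1): 2, (1, 2): 1, (2, 1): 1}
```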
def group_filter_label(self, group: str, filter_label: str)

Updates the filter_label of a group of columns.

Args

group : str
Name of the group
filter_label : str
New filter_label of the group. This label adds information about the filter that was used for that question in the questionnaire.
def group_has_mean(self, group: str, mean: str)

Updates if there is a useful mean value for a group of columns.

Args

group : str
Name of the group
mean : str
New mean state of the group. Should be "yes" or "no"
def group_kind(self, group: str, kind: str)

Updates the kind of a group of columns.

Args

group : str
Name of the group
kind : str
New kind of the group. Should be "multi", "single" or "batch"
def group_label(self, group: str, text: str, verbose: bool = True)

Updates the group_label of a group of kind = "batch" or "multi".

Args

group : str
Name of the group of columns
text : str
New text of the group_label
verbose : bool
Prints warnings if True. Defaults to True.
def group_lfm(self, group: str, lfm: str)

Updates the lfm (label from group) of a group of columns.

Args

group : str
Name of the group
lfm : str
New lfm of the group. Should be "yes" or "no"
def group_measure(self, group: str, measure: str)

Updates the measure of a group of columns.

Args

group : str
Name of the group
measure : str
New measure of the group. Should be "nominal", "string", "scale" or "ordinal"
def group_missing_values(self, group: str, missing_values: list[float])

Updates the missing values of a group of columns.

Args

group : str
Name of the group
missing_values : list[float]
New missing values of the group.
def group_value_labels(self, group: str, value_labels: Union[dict[int, str], str], keep_untouched_codes: bool = False)

Updates the value labels of a group of columns.

Args

group : str
Name of the group
value_labels : Union[dict[int, str], str]
A dictionary with new labels {1 : "label for code 1", 2: "label for code 2"} or the column name with the labels to be used "column_name"
keep_untouched_codes : bool, optional
This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to False.
def has_mean(self, column: str, mean: str)

Updates if there is a useful mean value for a column.

Args

column : str
Name of the column
mean : str
New mean state of the column. Should be "yes" or "no"
def import_config(self, gfs_config_name: str = 'gfs-config') ‑> None

Imports the gfs-config excel-file and updates the MetaTable according to the changes made in excel.

Args

gfs_config_name : str, optional
name of config if it should not be default or multiple configs are used. Defaults to "gfs-config".

Raises

FileNotFoundError
if a config file is not found
ValueError
description
def item_label(self, column: str, text: str, verbose: bool = True)

Updates the item_label of a variable of kind = "batch".

Args

column : str
Name of the column
text : str
New text of the item_label
verbose : bool
Prints warnings if True. Defaults to True.
def kind(self, column: str, kind: str)

Updates the kind of a column.

Args

column : str
Name of the column
kind : str
New kind of the column. Should be "multi", "single" or "batch"
def make_quota_check(self, columns: list[str], filename_quotas: str = 'cross_tab', filename_quota_check: str = 'quota_check', calc_quota_difference: bool = False, save_quota_check: bool = False)

Calculates the difference between a crosstab and a quota

Args

columns : list[str]
list of dataframe columns in crosstab
filename_quotas : str, optional
name of the excel file where the crosstab is
filename_quota_check : str, optional
name of the excel file where the difference in quotas is saved
calc_quota_difference : bool, optional
boolean to indicate if difference in quota is calculated
save_quota_check : bool, optional
boolean to indicate if difference in quota is saved in an excel file
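
The idea of a quota check can be sketched as a cell-by-cell difference between observed counts and quotas. The sign convention (observed minus quota), the dictionary layout and the name `quota_difference` are assumptions for illustration; the method itself reads the crosstab from an Excel file.

```python
def quota_difference(observed: dict[tuple, int],
                     quotas: dict[tuple, int]) -> dict[tuple, int]:
    """Observed count minus quota for every cell that appears in either input."""
    cells = sorted(set(observed) | set(quotas))
    return {cell: observed.get(cell, 0) - quotas.get(cell, 0) for cell in cells}


observed = {(1, 1): 48, (1, 2): 55, (2, 1): 50}
quotas = {(1, 1): 50, (1, 2): 50, (2, 1): 50, (2, 2): 50}

print(quota_difference(observed, quotas))
# {(1, 1): -2, (1, 2): 5, (2, 1): 0, (2, 2): -50}
```

Negative values mark cells that still need interviews under this convention.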
def measure(self, column: str, measure: str)

Updates the measure of a column.

Args

column : str
Name of the column
measure : str
New measure of the column. Should be "nominal", "string", "scale" or "ordinal"
def merge_open_questions(self, df_open_questions: pandas.core.frame.DataFrame, columns: list, code_list: dict, group_name: str, group_label: str = '', merge_Id: str = 'CODERESP', group_kind='multi', measure: str = 'auto', check_for_duplicates: bool = True) ‑> None

merges open questions with the metatable dataframe

Args

df_open_questions : pd.DataFrame
open question dataframe
columns : list
list of columns to merge (normally a group)
code_list : dict
dictionary with the new code list (used for group value labels)
group_name : str
group name to use
group_label : str, optional
Label for the group. Defaults to "".
merge_Id : str
id to merge columns on, defaults to CODERESP
group_kind : str, optional
kind of group. Defaults to 'multi'.
measure : str, optional
measure of the group. Defaults to 'auto'.
check_for_duplicates : bool, optional
Overrides the duplicate check. Duplicates are not checked if set to False. Defaults to True.
def merge_semiopen_questions(self, df_semiopen_questions: pandas.core.frame.DataFrame, columns: list, code_list: dict, group_name: str, merge_Id: str = 'CODERESP', check_for_duplicates: bool = True) ‑> None

merges semi open questions with the metatable dataframe

Args

df_semiopen_questions : pd.DataFrame
semiopen questions dataframe
columns : list
list of columns to merge (normally a group)
code_list : dict
dictionary with the new code list (used for group value labels)
group_name : str
group name to use
merge_Id : str
id to merge columns on, defaults to CODERESP
check_for_duplicates : bool, optional
Overrides the duplicate check. Duplicates are not checked if set to False. Defaults to True.
def missing_values(self, column: str, missing_values: list[float])

Updates the missing values of a column.

Args

column : str
Name of the column
missing_values : list[float]
New missing values of the column.
def move_column(self, column: str, end: bool = True)

Moves a column to the beginning or the end of the MetaTable

Args

column : str
Column to be moved
end : bool, optional
If end is True the column is moved to the end, if end is False the column is moved to the beginning. Defaults to True.
def move_columns(self, column_order: list)

Moves columns based on the desired order in the MetaTable.

Args

column_order : list
The desired column order.
def randomise_divers_gender(self, gender_column='S11', divers_values: list = [3], seed: int = 12345)

Randomises the divers gender value to either 1 or 2 with a 50/50 chance. Asserts that 1 and 2 are the male and female values.

Args

gender_column : str, optional
column name which has the values for gender. Defaults to "S11".
divers_values : list[int], optional
Values that correspond to the divers labels. If multiple are given, the randomisation is executed for each label sequentially. Defaults to [3].
seed : int, optional
Random seed, should normally not be changed. Defaults to 12345.
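
A seeded 50/50 replacement can be sketched as follows. This is not the method's code: `randomise_codes` is a hypothetical helper working on a plain list, and the real method may draw its random numbers differently, so the concrete assignments will not match.

```python
import random


def randomise_codes(codes: list[int], divers_values: tuple[int, ...] = (3,),
                    seed: int = 12345) -> list[int]:
    """Replace every divers code with 1 or 2, each with probability 0.5."""
    rng = random.Random(seed)  # fixed seed makes the result reproducible
    result = list(codes)
    for divers_value in divers_values:  # one pass per divers code, in order
        result = [rng.choice([1, 2]) if code == divers_value else code
                  for code in result]
    return result


gender = [1, 2, 3, 1, 3, 2]
randomised = randomise_codes(gender)
print(randomised)
```

Calling the function again with the same seed returns the same list, which keeps exports reproducible.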
def recode(self, old_column: str, new_column: str, values: dict[int, typing.Any], keep_untouched_codes: bool = True)

Recodes a column based on a dictionary with the new values.

Args

old_column : str
Name of the column
new_column : str
Name of the new column
values : dict[int, Any]
A dictionary with the new values and the old values that should be replaced. Can be {1: range(1, 20), 2: [20, 21], 3: 22}. This will replace all values from 1 to 19 with 1, 20 and 21 with 2, and 22 with 3.
keep_untouched_codes : bool
This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to True.
def recode_group(self, old_group: str, new_group: str, values: dict[int, typing.Any], keep_untouched_codes: bool = True)

Recodes a group of columns based on a dictionary with the new values.

Args

old_group : str
Name of the group
new_group : str
Name of the new group
values : dict[int, Any]
A dictionary with the new values and the old values that should be replaced. Can be {1: range(1, 20), 2: [20, 21], 3: 22}. This will replace all values from 1 to 19 with 1, 20 and 21 with 2, and 22 with 3.
keep_untouched_codes : bool
This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to True.
def remove_from_group(self, column: str)

Removes a column from a column group.

Args

column : str
Name of the column
def remove_speeders(self, speeder_value: float = None, Interview_Duration_Column: str = 'DURINT') ‑> None

Remove speeder rows from the DataFrame where interview duration is below the calculated speeder threshold.

Args

speeder_value : float
The precalculated speeder threshold value. If not provided, it will be calculated using _calculate_speeder_value.
Interview_Duration_Column : str
Name of the column containing interview durations. Default is "DURINT".
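
The filtering step can be sketched on a plain list of durations. How _calculate_speeder_value derives the threshold is not documented here, so the rule below (a share of the median duration) is a labelled assumption, and both helper names are hypothetical.

```python
from statistics import median
from typing import Optional


def speeder_threshold(durations: list[float], share_of_median: float = 0.5) -> float:
    # Assumption: hypothetical rule, the package's _calculate_speeder_value may differ.
    return share_of_median * median(durations)


def drop_speeders(durations: list[float],
                  speeder_value: Optional[float] = None) -> list[float]:
    """Keep only interviews whose duration is not below the threshold."""
    if speeder_value is None:
        speeder_value = speeder_threshold(durations)
    return [duration for duration in durations if duration >= speeder_value]


durations = [200, 620, 580, 640, 600]  # made-up interview durations
print(drop_speeders(durations))                     # [620, 580, 640, 600]
print(drop_speeders(durations, speeder_value=610))  # [620, 640]
```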
def rename_column(self, name: str, new_name: str)

Renames a column.

Args

name : str
Old name of the column
new_name : str
New name of the column
def rename_group(self, group: str, new_group_name: str)

renames a group

Args

group : str
group to rename
new_group_name : str
new group name
def select_columns(self, columns: list)

Selects columns and removes the others

Args

columns : list
Names of the columns to select
def show_column_info(self, column: str, show_objects: bool = False)

Shows info about the value labels and variable label of the given column.

Args

column : str
Column to be shown
show_objects : bool
If True, it prints the value_labels as lists.
def show_column_meta(self, column: str)

Shows info about the meta data of the given column.

Args

column : str
Column to be shown
def show_crosstab(self, columns: list[str], save_crosstab: bool = False, cross_tab_name: str = 'cross_tab', drop_na: bool = False, show_margins: bool = True)

Creates and shows a crosstab with a set of row and a set of column breaks

Args

columns : list[str]
list of dataframe columns in crosstab
save_crosstab : bool, optional
boolean to indicate if crosstab is saved in an excel file
cross_tab_name : str, optional
name of the excel file
drop_na : bool
If True, hides rows and columns whose values are all zero.
show_margins : bool
Shows the total of rows and columns if True
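
What show_margins adds to a crosstab can be shown with a minimal, standard-library sketch. It counts combinations of two columns and appends row and column totals. The function simple_crosstab, the "Total" label and the sample data are assumptions for illustration; the documentation does not state how show_crosstab builds its table internally (pandas itself offers pandas.crosstab with margins and dropna parameters).

```python
from collections import Counter


def simple_crosstab(rows, row_key, col_key, show_margins=True):
    """Count co-occurrences of two keys and optionally add totals."""
    counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_levels = sorted({r for r, _ in counts})
    col_levels = sorted({c for _, c in counts})
    # Missing combinations count as 0 because Counter returns 0 for them.
    table = {r: {c: counts[(r, c)] for c in col_levels} for r in row_levels}
    if show_margins:
        for r in row_levels:
            table[r]["Total"] = sum(table[r][c] for c in col_levels)
        table["Total"] = {
            c: sum(table[r][c] for r in row_levels) for c in col_levels
        }
        table["Total"]["Total"] = sum(counts.values())
    return table


answers = [
    {"sex": "f", "age": "18-39"},
    {"sex": "f", "age": "40+"},
    {"sex": "m", "age": "18-39"},
    {"sex": "f", "age": "18-39"},
]
table = simple_crosstab(answers, "sex", "age")
print(table["f"])      # {'18-39': 2, '40+': 1, 'Total': 3}
print(table["Total"])  # {'18-39': 3, '40+': 1, 'Total': 4}
```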
def show_group_info(self, group: str)

Shows info about the given group.

Args

group : str
The name of a group of columns
def show_group_meta(self, group: str)

Shows info about the meta data of the given group.

Args

group : str
The name of a group of columns
def single_label(self, column: str, text: str, verbose: bool = True)

Updates the label of a variable of kind = "single".

Args

column : str
Name of the column
text : str
New text of the label
verbose : bool
Prints warnings if True. Defaults to True.
def value_labels(self, column: str, value_labels: Union[dict[int, str], str], keep_untouched_codes: bool = False)

Updates the value labels of a column.

Args

column : str
Name of the column
value_labels : Union[dict[int, str], str]
A dictionary with new labels {1 : "label for code 1", 2: "label for code 2"} or the column name with the labels to be used "column_name"
keep_untouched_codes : bool, optional
This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to False.
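
The effect of keep_untouched_codes can be pictured as the difference between replacing a label dictionary and merging into it. The sketch below is a plain-dictionary illustration under that reading, not the method's actual code; the function name merge_value_labels and the sample labels are hypothetical.

```python
def merge_value_labels(
    old: dict[int, str],
    new: dict[int, str],
    keep_untouched_codes: bool = False,
) -> dict[int, str]:
    """Return the labels after an update, either replaced or merged."""
    if keep_untouched_codes:
        # Old labels survive unless the same code appears in `new`.
        return {**old, **new}
    # Default: only the newly supplied labels remain.
    return dict(new)


old_labels = {1: "Yes", 2: "No", 98: "Don't know"}
new_labels = {1: "Agree", 2: "Disagree"}

replaced = merge_value_labels(old_labels, new_labels)
merged = merge_value_labels(old_labels, new_labels, keep_untouched_codes=True)
print(replaced)  # {1: 'Agree', 2: 'Disagree'}
print(merged)    # {1: 'Agree', 2: 'Disagree', 98: "Don't know"}
```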
class gfs_Maps (info: dict, variables: list, map_scope: Map_Scope_Enum, map_level: Map_LOD_Enum, breaks: list = None, exclusion_filter=[], subplot_columns: int = 3, language: str = 'de', disable_hover_info: bool = False, background_map_color: str = '#999', projection: str = 'mercator', show_mean: bool = False, show_count: bool = False, show_total: bool = False, show_legend: bool = True, legend_labels_rename: dict[str, str] = None, left_labels_rename: dict[str, str] = None, break_labels_rename: dict[str, str] = None, legend_break: list = None, order_by: dict = None, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: dict = None, title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', legend_title_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, color_theme: str = 'auto', color_type: str = 'single_hue', color_direction: str = 'backward', color_custom: list[str] = None, legend_position: str = 'right', legend_labels_size: float = 14, weight: str = 'auto', select_variable_levels: list[int] = None, tag_position: int = 3, left_labels_wrap: int = 50, colorbar_length: float = 0.7, colorbar_x: float = 0.9, colorbar_y: float = 0.5, tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', left_labels_remove_before: str = '', left_labels_remove_after: str = '', title_remove_before: str = '', title_remove_after: str = '', label_size: int = 16, select_min_count: int = 0, save_figure: bool = True, df: pandas.core.frame.DataFrame = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto')

Creates a map chart

Args

info : dict
A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list
A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] /['AW1_1', 'AW1_2', …]
map_scope : Map_Scope_Enum
Defines a preset for the scope of the map, e.g. EU with all its countries.
map_level : Map_LOD_Enum
Defines the level of detail for the map, e.g. metropolitan areas, countries or others.
breaks : list, optional
A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to None.
exclusion_filter : list
A list of strings representing NUTS IDs to exclude from the maps, for example ["CH025", "CH022", "CH024"]. See https://ec.europa.eu/eurostat/web/nuts/nuts-maps for the available IDs.
subplot_columns : int
Number of subplot columns to use if there are multiple maps.
language : str
language to label the shapes, must be present in translation json
disable_hover_info : bool
disables hover info on shapes
background_map_color : str
hex color code to set the background color of map shapes, default is "#999"
projection : str
projection type as a string or None ('equirectangular', 'mercator', 'orthographic', 'natural earth', 'kavrayskiy7', 'miller', 'robinson', 'eckert4', 'azimuthal equal area', 'azimuthal equidistant', 'conic equal area', 'conic conformal', 'conic equidistant', 'gnomonic', 'stereographic', 'mollweide', 'hammer', 'transverse mercator', 'albers usa', 'winkel tripel', 'aitoff' and 'sinusoidal')
show_mean : bool, optional
Uses the mean of the variable as x-value and changes the scale type from percentage to mean. ~ True. Defaults to False.
show_count : bool, optional
Turns counts in variable labels on and off. ~ True. Defaults to False.
show_total : bool, optional
Turns total bar on and off. ~ True. Defaults to False.
show_legend : bool, optional
Turns legend on and off. ~ False. Defaults to True.
legend_labels_rename : dict[str, str], optional
Renames the labels of the different legend levels. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_rename : dict[str, str], optional
Renames the labels of the different break levels. ~ {'old_name': 'new_name'}. Defaults to None.
break_labels_rename : dict[str, str], optional
Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to None.
legend_break : list, optional
A list of breaks. This is used to create the legend break if either "mean" == True or "select_variable_levels" has exactly one item ~ ['alter_break'] / [['alter_break']]. Defaults to None.
order_by : dict, optional
A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to None.
special_variables : list[int], optional
A list of variable levels that will be treated differently. They appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional
Adds some standard arguments that normally don't need to be changed. Defaults to None.
title_custom_text : str, optional
Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional
Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional
Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
legend_title_custom_text : str, optional
Creates a custom legend title, if title is desired. Defaults to 'auto'.
title_position : int, optional
Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional
Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
color_theme : str, optional
Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_type : str, optional
Sets color value type. ~ 'diverging', 'single_hue'. Defaults to 'single_hue'.
color_direction : str, optional
Sets direction of color pattern. ~ 'forward', 'backward'. Defaults to 'backward'.
color_custom : list[str], optional
Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
value_labels_show_min : float, optional
Sets threshold showing labels by %. Use 0 to show all labels and 100 to show none. Use numbers between for a mix. Defaults to 0.
legend_position : str, optional
Sets the position of the legend. ~ 'bottom'. Defaults to 'right'.
legend_labels_size : float, optional
Sets the size of legend. Defaults to 14.
weight : str, optional
Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'auto'.
select_variable_levels : list[int], optional
A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to None.
tag_add_before : str, optional
Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional
Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
tag_position : int, optional
Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
left_labels_wrap : int, optional
Wraps the variable labels. Defaults to 50.
colorbar_length : float, optional
scales the colorbar length. Defaults to 0.7.
colorbar_x : float, optional
changes the x coordinate of the colorbar. Defaults to 0.9.
colorbar_y : float, optional
changes the y coordinate of the colorbar. Defaults to 0.5.
subtitle_add_before : str, optional
Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional
Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
left_labels_remove_before : str, optional
Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
left_labels_remove_after : str, optional
Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
title_remove_before : str, optional
Removes the title text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional
Removes the title text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
label_size : int, optional
Sets the size of labels. Defaults to 16.
select_min_count : int, optional
Selects all bars with higher count than given. ~ 10 means that only bars with more than 10 observations will be selected. Defaults to 0.
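
The remove_before and remove_after arguments above describe simple string trimming. A minimal sketch of that behaviour follows, using the markers "? " and ". Bitte" from the descriptions. The helper names and the sample label are hypothetical, and cutting at the first occurrence of the marker is an assumption, since the documentation does not say which occurrence is used.

```python
def remove_before(text: str, marker: str) -> str:
    """Drop everything up to and including the first `marker`."""
    if not marker or marker not in text:
        return text
    return text.split(marker, 1)[1]


def remove_after(text: str, marker: str) -> str:
    """Drop the first `marker` and everything that follows it."""
    if not marker or marker not in text:
        return text
    return text.split(marker, 1)[0]


# Made-up label text for illustration only.
label = "Wie zufrieden sind Sie? Mit dem Kundendienst. Bitte eine Antwort geben."

trimmed = remove_after(remove_before(label, "? "), ". Bitte")
print(trimmed)  # Mit dem Kundendienst
```

With the default empty string, both helpers return the text unchanged, which matches the documented default of ''.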

Methods

def clip_coordinates(self, unclipped_shapes: Map_Scope_Enum)

clips coordinates of background to a preset for the scope of the map and returns the background shape

Args

scope : Map_Scope_Enum
Enum to define the preset (EU or CH)
def create_map_figure(self, foreground_shapes: geopandas.geodataframe.GeoDataFrame, clipped_foreground_shapes: geopandas.geodataframe.GeoDataFrame)

creates map figures and combines background and foreground shapes with each other

Args

foreground_shapes : gpd.GeoDataFrame
geopandas DataFrame with the foreground shapes data
clipped_foreground_shapes : gpd.GeoDataFrame
geopandas DataFrame with the clipped foreground shapes data

Returns

go.Figure
plotly Figure with map objects
def create_maps(self) ‑> Optional[tuple]

Creates a map chart

Returns

Optional[tuple]
returns df and meta objects if return_data is True

def create_maps_df(self) ‑> pandas.core.frame.DataFrame

creates dataframe for a map graphic

Returns

pd.DataFrame
returns pandas dataframe for map

def get_foreground_shapes(self, resolution='03') ‑> geopandas.geodataframe.GeoDataFrame

gets foreground shape for the maps

Args

resolution : str
resolution size of the standard "NUTS" shapefiles - possible values are 03,10,20, default is 03

Returns

gpd.GeoDataFrame
returns filtered geopandas dataframe by level and scope

def get_translations(self) ‑> str

gets translation json for maps and map objects for the map scope

Returns

str
json string with all map objects

def init_shape_file(self, resolution, file=None) ‑> geopandas.geodataframe.GeoDataFrame

reads shapefile from the file system and loads it into a geopandas dataframe

Args

resolution : str
resolution size of the standard "NUTS" shapefiles - possible values are 03,10,20
file : str
file path to your own shapefile if you need it

Returns

gpd.GeoDataFrame
returns geopandas dataframe

def shape_clipping(self, foreground_shapes: geopandas.geodataframe.GeoDataFrame, exclusion_filter: list) ‑> geopandas.geodataframe.GeoDataFrame

removes shapes with a list of NUTS ID objects

Args

foreground_shapes : gpd.GeoDataFrame
shapes to be used in the map plot
exclusion_filter : list[str]
list of NUTS ID objects to remove

Returns

gpd.GeoDataFrame
returns clipped geopandas dataframe
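
The exclusion step can be sketched without geopandas as a filter over records keyed by NUTS ID. This is an illustration of the idea only: the function exclude_shapes is hypothetical, the shapes are stand-in dictionaries rather than a GeoDataFrame, and the field name "NUTS_ID" is an assumption about the shapefile's attribute name.

```python
def exclude_shapes(shapes: list[dict], exclusion_filter: list[str]) -> list[dict]:
    """Return the shapes whose NUTS ID is not in the exclusion list."""
    excluded = set(exclusion_filter)  # set lookup keeps the filter fast
    return [shape for shape in shapes if shape["NUTS_ID"] not in excluded]


# Stand-in records; real shapes would also carry a geometry.
shapes = [
    {"NUTS_ID": "CH011"},
    {"NUTS_ID": "CH022"},
    {"NUTS_ID": "CH024"},
    {"NUTS_ID": "CH025"},
]

remaining = exclude_shapes(shapes, ["CH025", "CH022", "CH024"])
print([shape["NUTS_ID"] for shape in remaining])  # ['CH011']
```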