Package `functions`

gfs_functions

This repository is part of the technology stack of gfs-zürich. It contains functions to create graphs and tables with Python.

Project objectives

streamline preprocessing of data files
streamline table creation for data files
streamline graphic creation for data files
create presentations

Technologies

Python
Jupyter Notebooks
SPSS Datafiles
Excel
JSON
API
Pandas
…

Requirements

Installed Python 3.10.6 on your local machine
…

Getting-Started

Clone this repository (for help see this tutorial).
Install or update your virtual environment by following the Virtual-Environment section below.
…

Virtual-Environment

Install Pipenv Package

In this project, we are using Pipenv for dependency management. Thus, you need the pipenv CLI tool installed on your computer:

open cmd or powershell
pip install pipenv

First install of Environment

As soon as this package is installed, you can install all required packages for this project by executing the following commands inside the project root folder:

open cmd or powershell
cd /your/local/github/repofolder/
pipenv install
restart VSCode
choose the newly created gfs_functions virtual environment python interpreter (for help see this tutorial)

Environment already installed (update dependencies)

If your environment exists and you only want to update the dependencies use these steps:

open cmd or powershell
cd /your/local/github/repofolder/
pipenv sync

Add new Dependency

If you need another dependency, not yet defined in the Pipfile, you can install it using this command and it will also be added to the dependency list.

open cmd or powershell
cd /your/local/github/repofolder/
pipenv install <package>

Featured Files

File & Folder Structure

+---.vscode            # VS-Code settings
+---data               # Data Folder with example files
+---docs               # Automatically generated documentation files
+---functions          # Main function folder
|   +---graphy         # Functions for graph creation
|   +---matrixmixer    # Functions for table creation
|   +---metatables     # Functions for SPSS file data preprocessing
|   \---tools          # Functions for general tasks
+---gfs_projects       # Repository with project files
+---output             # Folder for outputs
+---resources          # Folder with resources for maps
+---templates          # Folder with all template files
|   +---fonts          # .ttf files with project fonts
|   +---logo           # gfs logo files
|   +---mapping        # JSON files with standard variable mappings
|   +---powerpoint     # Templates of Powerpoint presentations
|   +---shapefiles     # Shapefiles for maps
|   \---translations   # JSON files used for language translations
+---test               # test functions
+---uml                # UML diagrams for documentation

Add documentation

The documentation will be added to the gfs-zurich.github.io repository. To get the documentation into the repository, clone it into the ./docs folder as functions. Then run the following command in the root folder of this repository to create the updated documentation:

pdoc --html --output-dir docs functions --config show_source_code=False -f

or use (be sure to install draw.io.exe locally)

pipenv run .\documentation.bat

UML

Workflow for Metatable / data preprocessing

Removing Jupyter notebook output when committing to a Git repository

A Git pre-commit hook can strip Jupyter notebook output automatically. The following steps set up a hook that clears the output of the notebook cells before each commit:

Open your terminal and navigate to the Git repository where your Jupyter notebook is located.
Create a new file called pre-commit in the .git/hooks directory of the repository by running the following command:


type nul > .git/hooks/pre-commit

Open the pre-commit file in a text editor and add the following code:


#!/bin/bash

# Find Jupyter notebook files that have been changed
changed_notebooks=$(git diff --cached --name-only --diff-filter=ACM | grep '\.ipynb$')

# If no notebooks have been changed, exit with a success status
if [ -z "$changed_notebooks" ]; then
  exit 0
fi

# Remove output from changed notebooks
for notebook in $changed_notebooks; do
  jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace "$notebook"
done

# Add changes to the commit
git add $changed_notebooks

# Exit with a success status
exit 0

Save and close the file.

From now on, every git commit runs the pre-commit script, which clears the output of the notebook cells before the commit is created; no further interaction is required. Because the script only processes staged notebooks (added, copied, or modified), output is removed only from the notebooks being committed, not from every notebook in the repository.
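The clearing step itself can also be sketched in plain Python. This is not what the hook above runs (it calls jupyter nbconvert); it is a minimal illustration that assumes the nbformat 4 file layout, in which a notebook is a JSON document whose code cells carry "outputs" and "execution_count" fields. The function name clear_notebook_outputs is hypothetical.

```python
import json
from pathlib import Path


def clear_notebook_outputs(path: Path) -> bool:
    """Remove outputs and execution counts from the code cells of one notebook.

    Returns True if the file was rewritten, False if it was already clean.
    """
    notebook = json.loads(path.read_text(encoding="utf-8"))
    changed = False
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have no outputs
        if cell.get("outputs") or cell.get("execution_count") is not None:
            cell["outputs"] = []
            cell["execution_count"] = None
            changed = True
    if changed:
        # Write the cleaned notebook back in place.
        path.write_text(
            json.dumps(notebook, indent=1, ensure_ascii=False) + "\n",
            encoding="utf-8",
        )
    return changed
```

A script built on this helper would still need to re-stage the cleaned files, as the hook does with git add.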

Sub-modules

functions.aioli
functions.graphy: gfs-graphy …
functions.matrixmixer: gfs-matrixmixer …
functions.metatables: gfs-metatables …
functions.tools: gfs-tools These are tools for various tasks inside the technology stack of gfs-zürich.

Functions

def add_combined_graphs_slide(info: dict, figures: list, title_text: str, subtitle_text: str, tag_text: str, group_images: bool = True, share_y_labels: bool = True) ‑> None

Combines a list of plotly figures onto a grid and saves it as a PowerPoint slide

Args

info : dict: The presentation object
figures : list: list of plotly figures; make sure each figure is a separate instance and, if in doubt, make a deepcopy
title_text : str: title_text
subtitle_text : str: subtitle_text
tag_text : str: tag_text
group_images : bool, optional: determines whether images are grouped in the slide. Defaults to True.
share_y_labels : bool, optional: determines whether images share the y axis labels row-wise. Defaults to True.

Raises

ValueError: if an invalid number of figures is entered, only 2 to 9 figures are valid
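The two constraints above (2 to 9 figures, each a separate instance) can be illustrated with a small sketch. The helper prepare_figures is hypothetical and not part of the package, and plain dictionaries stand in for plotly figures so that the example runs without plotly installed.

```python
from copy import deepcopy

MIN_FIGURES, MAX_FIGURES = 2, 9  # limits stated in the Raises section


def prepare_figures(figures: list) -> list:
    """Hypothetical helper: check the figure count and return independent copies."""
    if not MIN_FIGURES <= len(figures) <= MAX_FIGURES:
        raise ValueError(
            f"expected {MIN_FIGURES} to {MAX_FIGURES} figures, got {len(figures)}"
        )
    # Copy every entry so that no two grid cells share one object.
    return [deepcopy(figure) for figure in figures]


# Plain dictionaries stand in for plotly figures here.
base = {"layout": {"title": "A"}}
copies = prepare_figures([base, base])  # the same instance passed twice
copies[0]["layout"]["title"] = "B"      # only the first copy changes
```

Without the copy, both list entries would refer to the same object, and a change made for one grid cell would show up in the other.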

def add_table_slide(info: dict, title_text: str, table: pandas.core.frame.DataFrame, table_width: float = 22.5, table_height: float = 5.0, table_top_distance: float = 4.5, table_left_distance: float = 1.5, column_titles: Optional[None] = None, column_alignments: Optional[None] = None, column_widths: Optional[None] = None, bold_cells: Optional[None] = None, font_size: int = 14, row_heights: Optional[None] = None, cells_with_custom_color: Optional[None] = None, show_table_background_color: bool = True, title_font_size: int = 20, table_font_name: str = 'Leelawadee UI', cells_with_hyperlinks: Optional[None] = None) ‑> None

Adds a table slide to the presentation

Args

info : dict: The presentation object
title_text : str: The title of the slide
table : pd.DataFrame: The table that should be added to the slide
table_width : float, optional: The width of the table. Defaults to 22.5.
table_height : float, optional: The height of the table. Defaults to 5.0.
column_alignments : Union[None, list[str]], optional: The alignment of the columns. Defaults to None. If None, all columns will be aligned left. If a list is passed, it should contain the alignment for each column in the table. The alignment can be 1 (left), 2 (center), or 3 (right).
column_titles : Union[None, list[str]], optional: The titles of the columns. Defaults to None. If None, the table will not have column titles. If a list is passed, it should contain the title for each column in the table.
column_widths : Union[None, list[int]], optional: The width of the columns. Defaults to None. If None, all columns will have the same width. If a list is passed, it should contain the width for each column in the table. The values should be integers and only the relative width is important.
bold_cells : Union[None, list[tuple[int, int]]], optional: The cells that should be bold. Defaults to None. If None, no cells will be bold. If a list is passed, it should contain the row and column index of the cells that should be bold.
font_size : int, optional: The font size of the text in the table. Defaults to 14.
row_heights : Union[None, list[float]], optional: The height of the rows. Defaults to None. If None, all rows will have the automatic height. If a list is passed, it should contain the height for each row in the table.
cells_with_custom_color : Union[None, dict], optional: The cells that should have a custom color. Defaults to None. If None, no cells will have a custom color. If a dictionary is passed, it should contain the row and column index of the cells as keys and the color as values. The color should be a string with the hex code of the color.
show_table_background_color : bool, optional: Determines if the table background color should be shown. Defaults to True.
title_font_size : int, optional: The font size of the title. Defaults to 20.
table_font_name : str, optional: The name of the font used in the whole table. Defaults to 'Leelawadee UI'.
cells_with_hyperlinks : Union[None, dict], optional: The cells that should have a hyperlink. Defaults to None. If None, no cells will have a hyperlink. If a dictionary is passed, it should contain the row and column index of the cells as keys and the hyperlink as values. The hyperlink should be a string with the address of the hyperlink.
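The note that only the relative width matters for column_widths can be read as follows: the values are scaled so that together they fill table_width. The sketch below shows that interpretation; the helper absolute_column_widths is hypothetical, and how the package actually distributes the widths is not shown here.

```python
def absolute_column_widths(column_widths: list[int], table_width: float = 22.5) -> list[float]:
    """Hypothetical helper: scale relative widths so that they sum to table_width."""
    total = sum(column_widths)
    if total <= 0:
        raise ValueError("column_widths must contain positive values")
    return [table_width * width / total for width in column_widths]


# [2, 1, 1] and [4, 2, 2] describe the same layout: the first column takes half the table.
widths = absolute_column_widths([2, 1, 1])
same_layout = absolute_column_widths([4, 2, 2])
```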

def add_text_slide(info: dict, text: Union[list[dict], list[str]], my_title: str = 'gfs-zürich, Markt- & Sozialforschung', vertical_alignment: str = 'top') ‑> None

Adds a text slide to the presentation

Args

info : dict: The presentation object
text : Union[list[dict], list[str]]: The text that should be added to the slide. If a list of dictionaries is passed, each dictionary should contain a "text" key with the text that should be added to the slide. Additionally, the dictionary can contain a "size" key with the font size of the text, a "bold" key with a boolean value that determines if the text should be bold, a "hyperlink" key with the link target as its value, an "alignment" key that changes the text alignment (left, center, right) and a "color" key with the color of the text. If a list of strings is passed, each string will be added to the slide as a paragraph.
my_title : str, optional: The title of the slide. Defaults to 'gfs-zürich, Markt- & Sozialforschung'.
vertical_alignment : str, optional: The vertical alignment of the text. Defaults to 'top'.
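The two accepted shapes of text can be brought into one form before rendering. The sketch below is an illustration only: normalize_paragraphs is a hypothetical helper, the key names come from the description above, and the default values are assumptions.

```python
from typing import Union

# Keys named in the description of "text"; the default values are assumptions.
PARAGRAPH_DEFAULTS = {
    "size": None,
    "bold": False,
    "hyperlink": None,
    "alignment": "left",
    "color": None,
}


def normalize_paragraphs(text: Union[list[dict], list[str]]) -> list[dict]:
    """Hypothetical helper: turn strings and dictionaries into uniform paragraph dicts."""
    paragraphs = []
    for item in text:
        entry = {"text": item} if isinstance(item, str) else dict(item)
        if "text" not in entry:
            raise KeyError('each dictionary needs a "text" key')
        # Explicit keys win over the defaults.
        paragraphs.append({**PARAGRAPH_DEFAULTS, **entry})
    return paragraphs


paragraphs = normalize_paragraphs(
    ["First paragraph", {"text": "Second paragraph", "bold": True, "alignment": "center"}]
)
```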

def create_age_break(metatable: MetaTable2, original_variable: str = 'S12_1', new_break_name: str = 'age_break', dont_know_code: bool = False) ‑> None

Creates a standard age break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_apartment_search_break(metatable: MetaTable2, original_variable: str = 'WOHNUNGSSUCHE', new_break_name: str = 'apartment_search_break', dont_know_code: bool = False) ‑> None

Creates a standard apartment_search break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_apartment_search_simple_break(metatable: MetaTable2, original_variable: str = 'WOHNUNGSSUCHE', new_break_name: str = 'apartment_search_break', dont_know_code: bool = False) ‑> None

Creates a standard apartment_search break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_barchart(info: dict, variables: list, breaks: Optional[list] = None, legend_break: Optional[list] = None, barchart_mean: bool = False, break_labels_rename: Optional[dict[str, str]] = None, left_labels_rename: Optional[dict[str, str]] = None, legend_labels_rename: Optional[dict[str, str]] = None, left_labels_wrap: int = 50, left_labels_width: int = 999, show_mean: bool = False, title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', legend_title_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, legend_position: str = 'right', return_data: bool = False, color_theme: str = 'auto', color_type: str = 'groups', color_direction: str = 'forward', color_custom: Optional[None] = None, value_labels_show_min: float = 0, value_labels_add_after: Optional[list[list[str]]] = None, value_labels_transform: LambdaType = lambda n: n, weight: str = 'auto', order_by: Optional[None] = None, title_remove_before: str = '', title_remove_after: str = '', title_add_end: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', left_labels_remove_before: str = '', left_labels_remove_after: str = '', bar_gap: float = 999, bar_width: float = 999, bar_min_size: float = 0, break_distance: float = 999, label_size: float = 16, value_labels_size: float = 999, legend_labels_size: float = 14, show_legend: bool = True, show_total_legend: bool = True, show_total: bool = True, show_count: bool = False, show_index: bool = False, show_count_legend: bool = False, show_index_legend: bool = False, show_all_bars: bool = False, show_legend_title: Optional[bool] = None, select_variable_levels: Optional[None] = None, select_variables: Optional[None] = None, select_min_count: int = 0, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: Optional[None] = None, save_figure: bool = True, return_figure: bool = False, df: Union[pandas.core.frame.DataFrame, str] = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto') ‑> Optional[tuple]

Creates a barchart

Args

info : dict: A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list: A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
breaks : list, optional: A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to None.
legend_break : list, optional: A list of breaks. This is used to create the legend break if either "mean" == True or "select_variable_levels" has exactly one item ~ ['alter_break'] / [['alter_break']]. Defaults to None.
barchart_mean : bool, optional: Toggles barchart type mean, where mean value of variable is shown. Defaults to False.
break_labels_rename : dict[str, str], optional: Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_rename : dict[str, str], optional: Renames the labels of the different break levels. ~ {'old_name': 'new_name'}. Defaults to None.
legend_labels_rename : dict[str, str], optional: Renames the labels of the different legend levels. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_wrap : int, optional: Wraps the variable labels. Defaults to 50.
left_labels_width : int, optional: Changes the space allocated to the y-label. Defaults to 999 (which means automatic).
show_mean : bool, optional: Uses the mean of the variable as the x-value and changes the scale type from percentage to mean. ~ True. Defaults to False.
title_custom_text : str, optional: Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional: Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional: Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
legend_title_custom_text : str, optional: Creates a custom legend title, if title is desired. Defaults to 'auto'.
title_position : int, optional: Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional: Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional: Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
legend_position : str, optional: Sets the position of the legend. ~ 'bottom'. Defaults to 'right'.
return_data : bool, optional: Returns the data frame that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional: Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_type : str, optional: Sets color value type. ~ 'diverging', 'single_hue'. Defaults to 'groups'.
color_direction : str, optional: Sets direction of color pattern. ~ 'backward'. Defaults to 'forward'.
color_custom : list[str], optional: Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
value_labels_show_min : float, optional: Sets the percentage threshold below which value labels are hidden. Use 0 to show all labels and 100 to show none; values in between hide only the labels below the threshold. Defaults to 0.
value_labels_add_after : list[list[str]], optional: Adds text after the value labels ~ [['text', 'text'], ['text', 'text']]. Defaults to None.
weight : str, optional: Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'auto'.
order_by : dict, optional: A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to None.
title_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
title_add_end : str, optional: Appends the given string to the label text. Defaults to ''.
tag_add_before : str, optional: Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional: Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional: Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional: Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
left_labels_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
left_labels_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
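bar_gap : float, optional: Sets the gap between the bars within a group. Defaults to 999, which appears to mean automatic (as for left_labels_width); the automatic value is documented as 0.15.
bar_width : float, optional: Sets the width of the bars. Defaults to 999, which appears to mean automatic; the automatic value is documented as 0.7.
bar_min_size : float, optional: Sets the minimum size of the shown bars. Only affects the plot if show_all_bars is False. Defaults to 0.
break_distance : float, optional: Sets the distance between the breaks. Defaults to 999, which appears to mean automatic; the automatic value is documented as 0.5.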
label_size : float, optional: Sets the size of labels. Defaults to 16.
value_labels_size : float, optional: Sets the size of value labels. Defaults to 999.
value_labels_transform : LambdaType, optional: Transforms the values shown in the value labels. ~ lambda n: -n + 2 negates n and adds 2. Defaults to lambda n: n.
legend_labels_size : float, optional: Sets the size of legend. Defaults to 14.
show_legend : bool, optional: Turns legend on and off. ~ False. Defaults to True.
show_total_legend : bool, optional: Turns Total in legend on and off. ~ False. Defaults to True.
show_total : bool, optional: Turns total bar on and off. ~ False. Defaults to True.
show_count : bool, optional: Turns counts in variable labels on and off. ~ True. Defaults to False.
show_index : bool, optional: Turns index in variable labels on and off. ~ True. Defaults to False.
show_count_legend : bool, optional: Turns counts in legend labels on and off. ~ True. Defaults to False.
show_index_legend : bool, optional: Turns index in legend labels on and off. ~ True. Defaults to False.
show_all_bars : bool, optional: Turns bars with zero observations on and off. ~ True. Defaults to False.
show_legend_title : bool, optional: Toggles legend title. ~ True / False. Defaults to None, which means automatic.
select_variable_levels : list[int], optional: A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to None.
select_min_count : int, optional: Selects all bars with higher count than given. ~ 10 means that only bars with more than 10 observations will be selected. Defaults to 0.
special_variables : list[int], optional: A list of variable levels that are treated differently. They appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional: Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional: Turns figure export on and off. Defaults to True.
return_figure : bool, optional: Turns figure return on and off. Defaults to False.
df : pd.DataFrame, optional: The current data file ~ usually df. Defaults to 'auto'.
meta : pyreadstat._readstat_parser.metadata_container, optional: The current meta data file ~ usually meta. Defaults to 'auto'.

Returns

Optional[tuple]: returns df and meta objects if return_data is True
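The *_remove_before and *_remove_after arguments are described only by example, so a short sketch of the described behaviour may help. The helper trim_label is hypothetical and not the package's implementation; in particular, using the first occurrence of each marker is an assumption.

```python
def trim_label(label: str, remove_before: str = "", remove_after: str = "") -> str:
    """Hypothetical helper mirroring the described remove_before / remove_after behaviour."""
    if remove_before and remove_before in label:
        # Drop everything up to and including the first occurrence of the marker.
        label = label.split(remove_before, 1)[1]
    if remove_after and remove_after in label:
        # Drop the marker itself and everything that follows it.
        label = label.split(remove_after, 1)[0]
    return label


label = "Wie zufrieden sind Sie mit folgenden Angeboten? Kundendienst. Bitte bewerten Sie."
trimmed = trim_label(label, remove_before="? ", remove_after=". Bitte")
# trimmed is now "Kundendienst"
```

An empty string, the default for these arguments, leaves the label unchanged.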

def create_canton_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a canton break based on a JSON mapping file.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
original_variable_type : str: Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_cantonal_banc_break(metatable: MetaTable2, original_variable: str = 'KANTONALBANK', new_break_name: str = 'kantonalbank_break', dont_know_code: bool = False) ‑> None

Creates a standard kantonalbank break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_count_employee_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'count_employee_break', dont_know_code: bool = False) ‑> None

Creates a standard count employee break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_custom_break(metatable: MetaTable2, original_variable: str, new_break_name: str, recode_values: tuple[dict[int, range], bool], value_labels: dict[int, str], single_label: str, measure: str = 'nominal') ‑> None

Creates a custom break in the specified Metatable.

Args

metatable : MetaTable2: The Metatable that needs editing.
original_variable : str: Name of the original variable.
new_break_name : str: Name of the newly created break.
recode_values : Tuple[Dict[int, range], bool]: Tuple with recode values and whether to keep untouched codes, e.g., ({1: range(1, 6), 2: range(6, 11)}, True).
value_labels : Dict[int, str]: Value labels dictionary, e.g., {1: '1-5', 2: '6-10'}.
single_label : str: Single label of the break.
measure : str: Break measure, nominal, scale, etc. Default is 'nominal'.
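To make the shape of recode_values and value_labels concrete, the sketch below applies such a mapping to a plain list of codes. The function recode is hypothetical and works on a list rather than a MetaTable2; reading "keep untouched codes" as "values outside every range stay as they are" is an assumption.

```python
def recode(values: list, recode_values: tuple) -> list:
    """Hypothetical helper: map each value to the new code whose range contains it."""
    mapping, keep_untouched = recode_values
    recoded = []
    for value in values:
        for new_code, source_range in mapping.items():
            if value in source_range:
                recoded.append(new_code)
                break
        else:
            # No range matched: keep the original code or drop it to None.
            recoded.append(value if keep_untouched else None)
    return recoded


recode_values = ({1: range(1, 6), 2: range(6, 11)}, True)
value_labels = {1: "1-5", 2: "6-10"}

codes = recode([3, 7, 10, 98], recode_values)
labels = [value_labels.get(code, str(code)) for code in codes]
```

Note that range(1, 6) covers 1 to 5 inclusive, which matches the '1-5' label.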

def create_education_break(metatable: MetaTable2, original_variable: str = 'S15', new_break_name: str = 'education_break', dont_know_code: bool = False) ‑> None

Creates a standard education break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_employment_break(metatable: MetaTable2, original_variable: str = 'S13', new_break_name: str = 'employment_break', dont_know_code: bool = False) ‑> None

Creates a standard employment break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_founding_year_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'founding_year_break', dont_know_code: bool = False) ‑> None

Creates a standard founding year break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_gdenr_from_plz(metatable: MetaTable2, original_variable: str, new_variable: str, dont_know_code: bool = False) ‑> None

Creates the Gemeindenummer (municipality number) from the PLZ (postal code) based on a JSON mapping file.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_variable : str: Name of the newly created variable
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.
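A lookup driven by a JSON mapping could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the layout of the mapping (PLZ as key, Gemeindenummer as value), the two sample entries, the placeholder code for unmapped values, and the helper name plz_to_gdenr. The real mapping file and its handling of the "don't know" code are not shown here.

```python
import json

# Illustrative mapping only; the real mapping file and its layout are not shown here.
MAPPING_JSON = '{"8001": 261, "3011": 351}'
UNMAPPED_CODE = 99999998  # assumed placeholder for values without a mapping entry


def plz_to_gdenr(plz_values: list, mapping_json: str = MAPPING_JSON,
                 unmapped_code: int = UNMAPPED_CODE) -> list:
    """Hypothetical helper: look up each PLZ in a JSON mapping."""
    mapping = json.loads(mapping_json)  # JSON object keys are always strings
    return [mapping.get(str(plz), unmapped_code) for plz in plz_values]


gdenr = plz_to_gdenr([8001, "3011", 9999])
```

Converting each PLZ to a string before the lookup lets numeric and text input share one mapping.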

def create_gemeindegrösse_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a Gemeindegrösse (municipality size) break based on a JSON mapping file.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
original_variable_type : str: Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_gender_break(metatable: MetaTable2, original_variable: str = 'S11', new_break_name: str = 'gender_break', use_non_binary: bool = True, dont_know_code: bool = False) ‑> None

Creates a standard gender break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
use_non_binary : bool: Include "divers" gender if True, only use binary gender otherwise
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_grossregion_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a Grossregion (major region) break based on a JSON mapping file.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
original_variable_type : str: Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_income_break(metatable: MetaTable2, original_variable: str = 'S14', new_break_name: str = 'income_break', dont_know_code: bool = False) ‑> None

Creates a standard income break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_industry_sector_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'industry_sector_break', dont_know_code: bool = False, values: dict = {}, value_labels: dict = {}) ‑> None

Creates a standard industry sector (Branche, NOGA Code) break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.
values : dict: dictionary with industry sector values (NOGA)
value_labels : dict: dictionary with industry sector value labels (NOGA)

def create_lead_time_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'lead_time_break', year: int = 2025, dont_know_code: bool = False) ‑> None

Creates a standard lead time break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_linechart(info: dict, variables: list, breaks: Optional[list] = None, legend_break: Optional[list] = None, linechart_mean: bool = False, break_labels_rename: Optional[dict[str, str]] = None, left_labels_rename: Optional[dict[str, str]] = None, legend_labels_rename: Optional[dict[str, str]] = None, left_labels_wrap: int = 50, left_labels_width: int = 999, show_mean: bool = False, title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', legend_title_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, legend_position: str = 'right', return_data: bool = False, color_theme: str = 'auto', color_type: str = 'groups', color_direction: str = 'forward', color_custom: Optional[None] = None, value_labels_show_min: float = 0, value_labels_add_after: Optional[list[list[str]]] = None, weight: str = 'auto', order_by: Optional[None] = None, title_remove_before: str = '', title_remove_after: str = '', title_add_end: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', left_labels_remove_before: str = '', left_labels_remove_after: str = '', line_width: int = 999, line_marker_size: int = 999, break_distance: float = 999, label_size: float = 16, value_labels_size: float = 999, legend_labels_size: float = 14, show_legend: bool = True, show_total_legend: bool = True, show_total: bool = True, show_count: bool = False, show_index: bool = False, show_count_legend: bool = False, show_index_legend: bool = False, show_legend_title: Optional[bool] = None, select_variable_levels: Optional[None] = None, select_min_count: int = 0, break_x_axis: bool = False, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: Optional[None] = None, save_figure: bool = True, return_figure: bool = False, df: Union[pandas.core.frame.DataFrame, str] = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto') ‑> Optional[tuple]

Creates a linechart

Args

info : dict: A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list: A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
breaks : list, optional: A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to None.
legend_break : list, optional: A list of breaks. This is used to create the legend break if either "mean" == True or "select_variable_levels" has exactly one item ~ ['alter_break'] / [['alter_break']]. Defaults to None.
linechart_mean : bool, optional: Toggles linechart type mean, where mean value of variable is shown. Defaults to False.
break_labels_rename : dict[str, str], optional: Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_rename : dict[str, str], optional: Renames the labels of the different break levels. ~ {'old_name': 'new_name'}. Defaults to None.
legend_labels_rename : dict[str, str], optional: Renames the labels of the different legend levels. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_wrap : int, optional: Wraps the variable labels. Defaults to 50.
left_labels_width : int, optional: Changes the space allocated to the y-label. Defaults to 999 (which means automatic).
show_mean : bool, optional: Uses the mean of the variable as the x-value and changes the scale type from percentage to mean. ~ True. Defaults to False.
title_custom_text : str, optional: Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional: Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional: Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
legend_title_custom_text : str, optional: Creates a custom legend title, if title is desired. Defaults to 'auto'.
title_position : int, optional: Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional: Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional: Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
legend_position : str, optional: Sets the position of the legend. ~ 'bottom'. Defaults to 'right'.
return_data : bool, optional: Returns the data frame, that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional: Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_type : str, optional: Sets color value type. ~ 'diverging', 'single_hue'. Defaults to 'groups'.
color_direction : str, optional: Sets direction of color pattern. ~ 'backward'. Defaults to 'forward'.
color_custom : list[str], optional: Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
value_labels_show_min : float, optional: Sets threshold showing labels by %. Use 0 to show all labels and 100 to show none. Use numbers between for a mix. Defaults to 0.
value_labels_add_after : list[list[str]], optional: Adds text after the value labels ~ [['text', 'text'], ['text', 'text']]. Defaults to None.
weight : str, optional: Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'auto'.
order_by : dict, optional: A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to None.
title_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
title_add_end : str, optional: Appends the given string to the label text. Defaults to ''.
tag_add_before : str, optional: Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional: Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional: Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional: Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
left_labels_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
left_labels_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
bar_gap : int, optional: Sets the gap between the bars within a group. Defaults to 0.15.
bar_width : float, optional: Sets the width of the bars. Defaults to 0.7.
bar_min_size : float, optional: Sets the minimum size of the show bars. Only affects plot if show_all_bars is set to "False". Defaults to 0.
break_distance : float, optional: Sets the distance between the breaks. Defaults to 0.5
label_size : float, optional: Sets the size of labels. Defaults to 16.
value_labels_size : float, optional: Sets the size of value labels. Defaults to 999.
legend_labels_size : float, optional: Sets the size of legend. Defaults to 14.
show_legend : bool, optional: Turns legend on and off. ~ False. Defaults to True.
show_total_legend : bool, optional: Turns Total in legend on and off. ~ False. Defaults to True.
show_total : bool, optional: Turns total bar on and off. ~ False. Defaults to True.
show_count : bool, optional: Turns counts in variable labels on and off. ~ True. Defaults to False.
show_index : bool, optional: Turns index in variable labels on and off. ~ True. Defaults to False.
show_count_legend : bool, optional: Turns counts in legend labels on and off. ~ True. Defaults to False.
show_index_legend : bool, optional: Turns index in legend labels on and off. ~ True. Defaults to False.
show_all_bars : bool, optional: Turns counts bars with zero observations on and off. ~ True. Defaults to False.
show_legend_title : bool, optional: Toggles legend title. ~ True / False. Defaults to None, which means automatic.
select_variable_levels : list[int], optional: A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to None.
select_min_count : int, optional: Selects all bars with higher count than given. ~ 10 means that only bars with more than 10 observations will be selected. Defaults to 0.
break_x_axis : bool, optional: Moves the breaks to the x-axis and the left labels to the legend. With only one variable this happens by default. Mostly used for time-series data. Defaults to False.
special_variables : list[int], optional: A list of variable levels that will be treated differently. They will appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional: Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional: Turns figure export on and off. Defaults to True.
return_figure : bool, optional: Turns figure return on and off. Defaults to False.
df : pd.DataFrame, optional: The current data file ~ usually df. Defaults to 'auto'.
meta : pyreadstat._readstat_parser.metadata_container, optional: The current meta data file ~ usually meta. Defaults to 'auto'.

Returns

Optional[tuple]: returns df and meta objects if return_data is True
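
The title_remove_before and title_remove_after arguments are described only in words above. The trimming they describe can be sketched in plain Python as follows. The helper names trim_before and trim_after and the sample label are hypothetical and not part of this package; the real implementation may differ, for example when the marker occurs more than once.

```python
def trim_before(label: str, marker: str) -> str:
    """Drop everything up to and including the first occurrence of marker."""
    if not marker or marker not in label:
        return label  # an empty or missing marker leaves the label unchanged
    return label.split(marker, 1)[1]


def trim_after(label: str, marker: str) -> str:
    """Drop the first occurrence of marker and everything after it."""
    if not marker or marker not in label:
        return label
    return label.split(marker, 1)[0]


# Hypothetical question label, as it might appear in the metadata.
label = "Frage 3? Wie zufrieden sind Sie mit dem Service. Bitte eine Antwort wählen"

title = trim_after(trim_before(label, "? "), ". Bitte")
print(title)
```

With the markers "? " and ". Bitte" from the argument descriptions, only the question text between them remains.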

def create_market_change_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'market_change_break', dont_know_code: bool = False) ‑> None

Creates a standard market change break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_method_break(metatable: MetaTable2, original_variable: str = 'method_break', new_break_name: str = 'method_break', other_category_break: str = 'Andere', dont_know_code: bool = False) ‑> None

Creates a standard method break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
other_category_break : str: Name of the "Others" category
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_ownership_break(metatable: MetaTable2, original_variable: str = 'EIGENTUEMERMIETER', new_break_name: str = 'ownership_break', dont_know_code: bool = False) ‑> None

Creates a standard ownership break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_piechart(info: dict, variables: list, breaks: list = None, break_labels_rename: dict[str, str] = None, legend_labels_rename: dict[str, str] = None, legend_position: str = 'bottom', legend_title_custom_text: str = 'auto', title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, return_data: bool = False, color_theme: str = 'auto', color_type: str = 'groups', color_direction: str = 'forward', color_custom: list[str] = None, value_labels_color: int = 999, value_labels_show_min: float = 0.8, weight: str = 'auto', order_by: dict = None, title_remove_before: str = '', title_remove_after: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', value_labels_size: float = 16, legend_labels_size: float = 12, hole_size: float = 0.4, show_legend: bool = True, show_legend_title: Optional[bool] = None, show_count: bool = False, show_decimals: bool = False, select_variable_levels: list[int] = None, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: dict = None, save_figure: bool = True, return_figure: bool = False, df: pandas.core.frame.DataFrame = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto')

Creates a pie chart or donut chart, depending on the hole_size parameter.

Args

info : dict: A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list: A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
breaks : list, optional: A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to None.
break_labels_rename : dict[str, str], optional: Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to None.
legend_labels_rename : dict[str, str], optional: Renames the labels of the different legend levels. ~ {'old_name': 'new_name'}. Defaults to None.
legend_position : str, optional: Sets the position of the legend. ~ 'right'. Defaults to 'bottom'.
legend_title_custom_text : str, optional: Creates a custom legend title, if title is desired. Defaults to 'auto'.
title_custom_text : str, optional: Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional: Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional: Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
title_position : int, optional: Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional: Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional: Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
return_data : bool, optional: Returns the data frame, that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional: Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_type : str, optional: Sets color value type. ~ 'diverging', 'single_hue'. Defaults to 'groups'.
color_direction : str, optional: Sets direction of color pattern. ~ 'backward'. Defaults to 'forward'.
color_custom : list[str], optional: Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
value_labels_color : int, optional: Sets threshold for black or white label color. Use 0 for all black labels and 255 for all white. Use numbers between for a mix. Defaults to 999.
value_labels_show_min : float, optional: Sets threshold showing labels by %. Use 0 to show all labels and 100 to show none. Use numbers between for a mix. Defaults to 0.8.
weight : str, optional: Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'auto'.
order_by : dict, optional: A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to None.
title_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
tag_add_before : str, optional: Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional: Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional: Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional: Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
value_labels_size : float, optional: Sets the size of value labels. Defaults to 16.
legend_labels_size : float, optional: Sets the size of legend. Defaults to 12.
hole_size : float, optional: defines the hole size in a donut chart. Defaults to 0.4.
show_legend : bool, optional: Turns legend on and off. ~ False. Defaults to True.
show_legend_title : bool, optional: Toggles legend title. ~ True / False. Defaults to None, which means automatic.
show_count : bool, optional: Turns counts in variable labels on and off. ~ True. Defaults to False.
show_decimals : bool, optional: Rounds the labels to one decimal place if True. Defaults to False.
select_variable_levels : list[int], optional: A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to None.
special_variables : list[int], optional: A list of variable levels that will be treated differently. They will appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional: Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional: Turns figure export on and off. Defaults to True.
return_figure : bool, optional: Turns figure return on and off. Defaults to False.
df : pd.DataFrame, optional: The current data file ~ usually df. Defaults to 'auto'.
meta : pyreadstat._readstat_parser.metadata_container, optional: The current meta data file ~ usually meta. Defaults to 'auto'.

Returns

Optional[tuple]: returns df and meta objects if return_data is True
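
The value_labels_show_min argument is described as a percentage threshold for showing value labels. A minimal sketch of that idea, assuming shares are given in percent and that a label is shown when its share reaches the threshold, could look like this. The function visible_labels, the sample shares and the exact boundary handling are assumptions and not taken from this package.

```python
def visible_labels(shares: dict[str, float], show_min: float) -> dict[str, str]:
    """Return formatted labels for all shares (in percent) that reach show_min."""
    return {
        name: f"{value:.1f}%"
        for name, value in shares.items()
        if value >= show_min  # assumption: the threshold itself is still shown
    }


# Hypothetical shares of a single variable, in percent.
shares = {"Ja": 61.5, "Nein": 37.9, "Weiss nicht": 0.6}

print(visible_labels(shares, show_min=0.8))
print(visible_labels(shares, show_min=0))
```

With the threshold 0.8, the 0.6 % slice keeps its place in the chart but gets no label; with 0 every slice is labelled.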

def create_role_break(metatable: MetaTable2, original_variables: str, new_break_name: str = 'role_break', dont_know_code: bool = False) ‑> None

Creates a standard role break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variables : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_role_year_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'create_role_year_break', dont_know_code: bool = False) ‑> None

Creates a standard role year break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_sex_break(metatable: MetaTable2, original_variable: str = 'S11', new_break_name: str = 'sex_break', use_non_binary: bool = True, dont_know_code: bool = False) ‑> None

Creates a standard sex break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
use_non_binary : bool: Include "diverse" sex if True, only use binary sex otherwise
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_siedlungsart_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a siedlungsart break based on a JSON mapping file.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
original_variable_type : str: Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_sprachregion_break(metatable: MetaTable2, original_variable: str, original_variable_type: str, new_break_name: str, dont_know_code: bool = False) ‑> None

Creates a sprachregion break based on a JSON mapping file.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
original_variable_type : str: Type of the original variable. "gdenr" if it represents the Gemeindenummer.
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.
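
Both create_siedlungsart_break and create_sprachregion_break are described as being based on a JSON mapping file. The general idea of recoding a Gemeindenummer through such a mapping can be sketched as below. The structure of the mapping, the municipality numbers, the category labels and the helper recode_by_mapping are all invented for illustration; the actual mapping file and the MetaTable2 handling in this package are not shown here and may look different.

```python
import json
from typing import Optional

# Hypothetical mapping file content: Gemeindenummer -> category label.
# The numbers and labels are made up for this example.
mapping_json = '{"1001": "Stadt", "1002": "Agglomeration", "1003": "Land"}'
mapping: dict[str, str] = json.loads(mapping_json)


def recode_by_mapping(values: list[int], mapping: dict[str, str]) -> list[Optional[str]]:
    """Look up each value in the mapping; unknown values become None."""
    # JSON object keys are always strings, so the numeric codes are converted.
    return [mapping.get(str(value)) for value in values]


gdenr_column = [1001, 1003, 1002, 9999]
print(recode_by_mapping(gdenr_column, mapping))
```

Values without an entry in the mapping come back as None, which makes gaps in the mapping file easy to spot before the break is used in a chart.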

def create_stacked_barchart(info: dict, variables: list, breaks: list = None, break_labels_rename: dict[str, str] = None, left_labels_rename: dict[str, str] = None, legend_labels_rename: dict[str, str] = None, left_labels_wrap: int = 50, left_labels_width: int = 999, show_mean: bool = False, title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, return_data: bool = False, color_theme: str = 'auto', color_type: str = 'diverging', color_direction: str = 'forward', color_custom: list[str] = None, value_labels_color: int = 999, value_labels_show_min: float = 0.8, weight: str = 'auto', order_by: dict = None, title_remove_before: str = '', title_remove_after: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', left_labels_remove_before: str = '', left_labels_remove_after: str = '', value_labels_add_after: str = '', bar_width: float = 999, break_distance: float = 999, label_size: float = 16, value_labels_size: float = 16, legend_labels_size: float = 14, show_legend: bool = True, show_total: bool = True, show_count: bool = False, show_index: bool = False, mean_name: str = 'MW', mean_transform: function = <function <lambda>>, mean_custom: list[str] = None, select_variable_levels: list[int] = None, select_variables: list[int] = None, select_min_count: int = 0, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: dict = None, save_figure: bool = True, return_figure: bool = False, df: pandas.core.frame.DataFrame = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto')

Creates a stacked barchart

Args

info : dict: A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
df : pd.DataFrame: The current data file ~ usually df
meta : pyreadstat._readstat_parser.metadata_container: The current meta data file ~ usually meta
variables : list[str]: A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
breaks : list[str], optional: A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to [].
break_labels_rename : dict[str, str], optional: Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to {}.
left_labels_rename : dict[str, str], optional: Renames the labels of the different break levels. ~ {'old_name': 'new_name'}. Defaults to {}.
legend_labels_rename : dict[str, str], optional: Renames the labels of the variable in legend. ~ {'old_name': 'new_name'}. Defaults to {}.
left_labels_wrap : int, optional: Wraps the variable labels. Defaults to 50.
left_labels_width : int, optional: Changes the space allocated to the y-label. Defaults to 999 (which means automatic).
show_mean : bool, optional: Turns the mean annotations on. ~ True. Defaults to False.
title_custom_text : str, optional: Creates custom title text, if automatic text isn't desired. Defaults to "auto".
subtitle_custom_text : str, optional: Creates custom subtitle text, if automatic text isn't desired. Defaults to "auto".
tag_custom_text : str, optional: Creates custom tag text, if automatic text isn't desired. Defaults to "auto".
title_position : int, optional: Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional: Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional: Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
return_data : bool, optional: Returns the data frame, that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional: Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'standard'.
color_type : str, optional: Sets color value type. ~ 'groups', 'single_hue'. Defaults to 'diverging'.
color_direction : str, optional: Sets direction of color pattern. ~ 'backward'. Defaults to 'forward'.
value_labels_color : int, optional: Sets threshold for black or white label color. Use 0 for all black labels and 255 for all white. Use numbers between for a mix. Defaults to 130.
value_labels_show_min : float, optional: Sets threshold showing labels by %. Use 0 to show all labels and 100 to show none. Use numbers between for a mix. Defaults to 0.8.
weight : str, optional: Turns weight on if weight variable name is added. ~ "gewicht". Defaults to ''.
order_by : dict, optional: A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to {}.
title_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
tag_add_before : str, optional: Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional: Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional: Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional: Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
left_labels_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
left_labels_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
value_labels_add_after : str, optional: Adds text to value labels. Is useful to add a % sign for example. Defaults to ''.
bar_width : float, optional: Sets the width of the bars. Defaults to 0.8.
break_distance : float, optional: Sets the distance between the breaks. Defaults to 0.5.
label_size : float, optional: Sets the size of labels. Defaults to 16.
value_labels_size : float, optional: Sets the size of value labels. Defaults to 16.
legend_labels_size : float, optional: Sets the size of legend. Defaults to 14.
show_legend : bool, optional: Turns legend on and off. ~ False. Defaults to True.
show_total : bool, optional: Turns total bar on and off. ~ False. Defaults to True.
show_count : bool, optional: Turns counts in variable labels on and off. ~ True. Defaults to False.
show_index : bool, optional: Turns index in variable labels on and off. ~ True. Defaults to False.
mean_name : str, optional: Changes name written above the mean values. Defaults to 'MW'.
mean_transform : LambdaType, optional: Changes the size of the mean values with a transformation. lambda n: -n + 2 changes n to the negative value of n and adds 2. Defaults to lambda n: n.
select_variable_levels : list[int], optional: A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to [].
special_variables : list[int], optional: A list of variable levels that will be treated differently. They will appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional: Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional: Turns figure export on and off. ~ False. Defaults to True.
return_figure : bool, optional: Turns figure return on and off. ~ True. Defaults to False.

Returns

If return_data is True, returns the plot data so it can be viewed.
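
The mean_transform argument takes a function that is applied to the mean values, with the identity lambda n: n as default and lambda n: -n + 2 as the documented example. How such a callable acts on a list of means can be sketched as follows. The helper transformed_means and the sample values are hypothetical; how the package applies the transformation internally is not shown in this documentation.

```python
from typing import Callable


def transformed_means(
    means: list[float],
    mean_transform: Callable[[float], float] = lambda n: n,
) -> list[float]:
    """Apply mean_transform to every mean value and return the results."""
    return [mean_transform(mean) for mean in means]


# Hypothetical mean values of three variables.
means = [1.5, 0.25, -1.0]

print(transformed_means(means))                     # identity: values unchanged
print(transformed_means(means, lambda n: -n + 2))   # negate, then add 2
```

A transformation like this is useful when the coded scale runs in the opposite direction of the scale that should be reported.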

def create_total_column(metatable: MetaTable2, name: str = 'tz') ‑> None

Creates the total column used in many standard tasks.

Args

metatable : MetaTable2: The Metatable that needs editing
name : str: Name that overwrites the column name. Defaults to 'tz'.

def create_winner_loser_break(metatable: MetaTable2, original_variable: str, new_break_name: str = 'winner_loser_break', dont_know_code: bool = False) ‑> None

Creates a standard winner/loser break.

Args

metatable : MetaTable2: The Metatable that needs editing
original_variable : str: Name of the original variable
new_break_name : str: Name of the newly created break
dont_know_code : bool: Whether the "don't know" code should be recoded. Defaults to False.

def create_wordcloud(info: dict, variables: list[str], language_column: str, translate_to: int = 564, transform: str = 'no', font_name: str = 'arial', title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, tag_position: int = 3, return_data: bool = False, color_theme: str = 'auto', color_custom: str = 'auto', title_remove_before: str = '', title_remove_after: str = '', tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', remove_words: Optional[None] = None, min_font_size: int = 0, min_count: int = 1, max_words: int = 200, translation_file_path: str = None, standard_arguments: dict = None, save_figure: bool = True, df: pandas.core.frame.DataFrame = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto') ‑> Optional[tuple]

Generates a wordcloud plot. Also translates open answers into a target language.

Args

info : dict: A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list: A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] / ['AW1_1', 'AW1_2', …]
language_column : str: The column that contains the language information.
translate_to : int, optional: Language code to translate to; matches the Nebu language code. Defaults to 564.
transform : str, optional: How to transform the words. ~ 'upper', 'lower', 'no'. Defaults to 'no'.
font_name : str, optional: font to use. Defaults to 'arial'.
title_custom_text : str, optional: Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional: Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional: Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
title_position : int, optional: Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional: Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
tag_position : int, optional: Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
return_data : bool, optional: Returns the data frame, that is used to construct the graph, so it can be viewed. The data frame will be printed out if the parameter is True. This doesn't affect the graph in any way. Defaults to False.
color_theme : str, optional: Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_custom : str, optional: Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
title_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
tag_add_before : str, optional: Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional: Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
subtitle_add_before : str, optional: Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional: Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
remove_words : Optional[list[str]], optional: Words to remove in the wordcloud. Defaults to None.
min_font_size : int, optional: Minimum font size to be displayed. Defaults to 0.
min_count : int, optional: Minimum word frequency for a word to be displayed. Defaults to 1.
max_words : int, optional: Maximum number of words to be displayed. Defaults to 200.
translation_file_path : str, optional: Path to a file with already translated texts. Defaults to None.
standard_arguments : dict, optional: Adds some standard arguments that normally don't need to be changed. Defaults to None.
save_figure : bool, optional: Turns figure export on and off. Defaults to True.
df : pd.DataFrame, optional: The current data file ~ usually df. Defaults to 'auto'.
meta : pyreadstat._readstat_parser.metadata_container, optional: The current meta data file ~ usually meta. Defaults to 'auto'.

Returns

Optional[tuple]: description

def get_answer_list(df, column: str) ‑> list

Get a list of answers from a column in a data frame.

Args

df : pd.DataFrame: The data frame.
column : str: The column name.

Returns

list: A list of answers.

def get_word_frequencies(german_answers: list[str], transform: str, remove_words: Optional[None] = None, min_count: int = 1) ‑> dict

Get word frequencies from a list of German answers.

Args

german_answers : list: A list of German answers.
transform : str: How to transform the words. ~ 'upper', 'lower', 'no'
remove_words : list, optional: A list of words to remove. Defaults to None.
min_count : int, optional: The minimum count of a word to be included. Defaults to 1.
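
The arguments above (transform, remove_words, min_count) suggest a counting step like the following sketch built on collections.Counter. It is not the package's implementation: the function name count_words, splitting on whitespace and comparing remove_words after the transform are assumptions made for this example.

```python
from collections import Counter
from typing import Optional


def count_words(
    answers: list[str],
    transform: str = "no",
    remove_words: Optional[list[str]] = None,
    min_count: int = 1,
) -> dict[str, int]:
    """Count words over all answers and keep those seen at least min_count times."""
    removed = set(remove_words or [])
    counts: Counter = Counter()
    for answer in answers:
        for word in answer.split():  # assumption: words are separated by whitespace
            if transform == "upper":
                word = word.upper()
            elif transform == "lower":
                word = word.lower()
            if word not in removed:  # assumption: compared after the transform
                counts[word] += 1
    return {word: n for word, n in counts.items() if n >= min_count}


# Hypothetical open answers.
answers = ["Preis Qualität", "preis", "Service Preis"]

print(count_words(answers, transform="lower", remove_words=["service"], min_count=2))
```

Lower-casing merges "Preis" and "preis" into one entry, and the min_count filter then drops every word that appears only once.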

def load_and_render_panel_email(invitation_reason: str, end_of_study: str, study_code: str, length_min: float, points: Union[float, int] = 'auto', template_html: str = 'Vorlage_Mail_EinladungUmfrage_dt.html', output_file_path: str = 'Vorlage_Mail_EinladungUmfrage_dt.html', input_file_path: str = './templates/polittrends_invitation_templates/')

Creates a panel invitation email by adding the invitation reason, the study number and the length of the study to the panel templates. Points and CHF values are calculated from the length of the study. Other languages can be chosen by using another template.

Args

invitation_reason : str: Reason for Invitation
end_of_study : str: End of study, as a string but should be a date (e.g. 16. November 2023)
study_code : str: Code from Manageframes (e.g. 999923528)
length_min : float: length of the study in minutes
points : Union[float, int]: points for the study, if 'auto' points are calculated from length_min
template_html : str: template file name
output_file_path : str: file name to be saved
input_file_path : str: folder for templates

def print_word_frequencies(df, column, min_count)

Prints word frequencies for a given column.

Args

df : pd.DataFrame: The data frame.
column : str: The column name.
min_count : int: The minimum count of a word to be included.

def save_presentation(info: dict) ‑> None

Saves the presentation

Args

info : dict: The presentation object

def show_stats(meta, variable)

def show_summary(meta, columns: list = [])

def start_presentation(name: str, df: pandas.core.frame.DataFrame, meta: pyreadstat._readstat_parser.metadata_container, output_path: str = './output/', template_path: str = './templates/', template_custom: str = 'auto', color_theme: str = 'standard', language: str = 'de', value_labels_color: int = 130, weight: str = '', line_width_linechart: int = 4, line_marker_size_linechart: int = 8, bar_width_barchart: float = 0.7, bar_width_stacked_barchart: float = 0.8, bar_gap_barchart: float = 0.15, break_distance: float = 0.5, count_name: str = 'N', logo_side: str = 'right', project_leader: str = None, project_people: list[str] = None, export_svg: bool = False, standard_arguments: dict = None, show_page_number: bool = False, use_wide_screen: bool = False) ‑> dict

Creates a new presentation

Args

name : str: Changes the name of the presentation
df : pd.DataFrame: The current data file ~ usually df
meta : pyreadstat._readstat_parser.metadata_container: The current meta data file ~ usually meta
output_path : str, optional: Changes path and name of the output files. Defaults to "./output/".
template_path : str, optional: Sets path of pptx-template file. Defaults to './templates/'.
template_custom : str, optional: Choose a custom template. Defaults to 'auto'.
color_theme : str, optional: Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'standard'.
language : str, optional: Sets language for texts like subtitle and tag. ~ 'fr', 'en', 'it'. Defaults to 'de'.
value_labels_color : int, optional: Sets threshold for black or white label color. Use 0 for all black labels and 255 for all white. Use numbers between for a mix. Defaults to 130.
weight : str, optional: Turns weight on if weight variable name is added. ~ "gewicht". Defaults to ''.
bar_width_barchart : float, optional: Sets the width of the barchart bars. Defaults to 0.7.
bar_width_stacked_barchart : float, optional: Sets the width of the stacked barchart bars. Defaults to 0.8.
bar_gap_barchart : float, optional: Sets the gap between the bars within a group. Defaults to 0.15.
break_distance : float, optional: Sets the distance between the breaks. Defaults to 0.5.
count_name : str, optional: Changes the name of the count. ~ if "K" is passed in, counts look like "K = 42". Defaults to 'N'.
logo_side : str, optional: Changes the position of the logo on the presentation slides. ~ 'left'. Defaults to 'right'.
project_leader : str: The first name of the project leader. Can be "Andreas" for example. Defaults to None.
project_people : list[str], optional: A list of the names of the people working on the project. For example ["Andreas", "Nadia"]. Defaults to None.
export_svg : bool, optional: Exports graphs as .svg files if True. Defaults to False.
standard_arguments : dict, optional: Adds some standard arguments that normally don't need to be changed. Defaults to None.
show_page_number : bool, optional: Turns page number on PowerPoint on and off ~ True. Defaults to False.
use_wide_screen : bool, optional: Turns slides into 16:9 widescreen format ~ True. Defaults to False.

Returns

dict: Returns a dictionary with presentation info, the presentation object and some info about the plot theme.

def translate_column(info: dict, open_question_column: str, language_column: str, translate_to: int, df: pandas.core.frame.DataFrame, translation_file_path: str) ‑> None

Translates a column to a different language. Saves the translations to a JSON-file.

Args

info : dict: A dictionary with info about the presentation.
open_question_column : str: The column that should be translated.
language_column : str: The column that contains the language information.
translate_to : int: The language code to translate to.
df : pd.DataFrame: The data frame that contains the columns.
translation_file_path : str: file path to take the saved translations from.

Classes

class MatrixMixer (meta_table: MetaTable2, breaks: list = [['tz']], language: str = 'de', weight: str = 'tz', special_variables_last: bool = True, study_title: str = 'Studie', significance_level: float = 0.05, use_conf_interval: bool = False, conf_interval: float = 0.95, column_width: int = 15, hide_total_break: bool = False)

Assembles multiple tables with descriptive statistics and exports them to excel.

Args

meta_table : MetaTable2: MetaTable created using gfs_functions
breaks : list[list[str]]: Selects the breaks used by default. Defaults to [['tz']].
language : str: Selects the language of the Matrix. Defaults to 'de'.
weight : str: Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'tz'.
special_variables_last : bool: If True, special variables will be shown at the bottom. Defaults to True.
study_title : str: Title of the study. Defaults to 'Studie'.
use_conf_interval : bool: Show confidence intervals. Defaults to False.
conf_interval : float: Sets the confidence level of the confidence interval. Defaults to 0.95.
column_width : int: Column width in Excel. Defaults to 15.
hide_total_break : bool: hides the "C" column, which is normally the "Total" break. Defaults to False.
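
The documentation does not state how MatrixMixer computes its confidence intervals. As a rough orientation for what conf_interval = 0.95 means for a percentage value, the following sketch uses the common normal-approximation (Wald) interval for a proportion. The function name and the choice of method are assumptions, not the package's implementation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_conf_interval(share: float, n: int,
                             conf_interval: float = 0.95) -> tuple[float, float]:
    """Wald interval for a share observed in n answers (illustrative assumption)."""
    # Two-sided critical value, about 1.96 for a 95 % interval.
    z = NormalDist().inv_cdf(0.5 + conf_interval / 2)
    margin = z * sqrt(share * (1 - share) / n)
    # Clip to the valid range of a proportion.
    return max(0.0, share - margin), min(1.0, share + margin)


low, high = proportion_conf_interval(0.4, 1000)
print(f"{low:.3f} to {high:.3f}")
```

For a share of 40 % in 1000 answers this gives an interval of roughly plus or minus 3 percentage points.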

Method generated by attrs for class MatrixMixer.

Class variables

var breaks : list
var column_width : int
var conf_interval : float
var hide_total_break : bool
var language : str
var meta_table : MetaTable2
var significance_level : float
var special_variables_last : bool
var study_title : str
var use_conf_interval : bool
var weight : str

Methods

def add_title_page_info(self, study_title: str, clients: list[str], survey_methodology: str, sampling: str, quota_features: str, address_origin: list[str], population: str, weighting: str, survey_period: str, random_sample: str = None)

sets title page info

Args

study_title : str: title of study
clients : list[str]: list of clients
survey_methodology : str: method used in study
sampling : str: sampling technique used
quota_features : str: features the quotas were set on
address_origin : list[str]: origin of addresses
population : str: population of study
weighting : str: weighting variables or factors
survey_period : str: period in which the survey took place
random_sample : str: used to overwrite the random sample string, default is n = (number of recipients)

def create_matrix(self, question: str, breaks: list[list[str]] = 'auto', weight: str = 'auto', order_descending: Union[str, bool] = 'auto', special_variables: Union[str, list[int]] = 'auto', nps: bool = False, groupings: dict = None) ‑> None

Creates a single matrix and saves it to the MatrixMixer object.

Args

question : str: Name of question. Can either be of type Column or Group.
breaks : list[list[str]], optional: Selects the breaks used in matrix. Defaults to 'auto'.
weight : str, optional: Selects the weight used in matrix. Defaults to 'auto'.
order_descending : Union[str, bool], optional: Orders the values descending or not. Use either False or True. Defaults to 'auto'.
special_variables : Union[str, list[int]], optional: List of special variables which will appear at the bottom. Could be something like [96, 97, 98, 99999997, 99999998]. Defaults to 'auto'.
nps : bool, optional: add nps to the matrix. Defaults to False.
groupings : dict, optional: groupings dictionary to create variables boxes. an example is {"Top-Box (ja, eher ja)":[1, 2]}. Defaults to None.
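
To make the groupings argument more concrete, the sketch below computes the share of answers that fall into each box, using the example dictionary from the description. The helper box_shares is hypothetical and works on a plain list of codes; how create_matrix itself handles weights, breaks and missing values is not shown here.

```python
def box_shares(codes: list[int], groupings: dict[str, list[int]]) -> dict[str, float]:
    """Share of answers per box (hypothetical helper, unweighted)."""
    total = len(codes)
    if total == 0:
        return {label: 0.0 for label in groupings}
    return {
        label: sum(code in members for code in codes) / total
        for label, members in groupings.items()
    }


answers = [1, 2, 2, 3, 4, 1, 2, 4]
shares = box_shares(answers, {"Top-Box (ja, eher ja)": [1, 2]})
print(shares)
```

Five of the eight answers carry the codes 1 or 2, so the Top-Box share is 0.625.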

def export_excel(self, show_title_page: bool = True, show_combined_sheet: bool = True, show_percentage_sheet: bool = True, show_absolute_sheet: bool = False, show_debugging_sheet: bool = False, table_name: str = None) ‑> None

Exports the matrices from the MatrixMixer to Excel.

Args

show_title_page : bool, optional: show a title page. Defaults to True.
show_combined_sheet : bool, optional: shows sheet with combined values. Defaults to True.
show_percentage_sheet : bool, optional: shows sheet with percentage values. Defaults to True.
show_absolute_sheet : bool, optional: shows sheet with absolute values. Defaults to False.
show_debugging_sheet : bool, optional: shows all sheets for debugging purposes. Defaults to False.
table_name : str, optional: Name of the exported table file. Defaults to None.

class MetaTable (df: pandas.core.frame.DataFrame, meta: pyreadstat._readstat_parser.metadata_container, project_name: str, year: str = '2023')

creates MetaTable object that makes data wrangling of files read with pyreadstat easy

Args

df : pd.DataFrame: DataFrame read with pyreadstat
meta : pyreadstat._readstat_parser.metadata_container: pyreadstat metadata
project_name : str: The name of the project. Is used as a folder name for the generated files
gfs_meta : dict: gfs specific meta data that is used in other gfs products

Methods

def create_column_copy(self, new_column: str, column: str, copy_values: bool)

Creates a new column that's a copy of another column in the MetaTable

Args

new_column : str: Name of the new column
column : str: Name of the column to be copied
copy_values : bool: If True it copies the dataframe values, if False, the column will be full of np.NaN

def create_empty_columns(self, columns: list[str], label: str, variable_labels: dict[int, str])

def export_config(self)

Exports an excel-file that makes changing the meta data very simple.

def import_config(self)

Imports the excel-file with the changed meta data and changes the MetaTable accordingly.

def recode(self, columns: Union[list[str], dict[str, str]], values: dict[int, typing.Any], keep_untouched_codes=True)

recodes chosen columns

Args

columns : Union[list[str], dict[str, str]]: can either be a list of the columns that should get new codes list ['F5_01', 'F5_02'] or a dictionary with columns that should be kept unchanged as keys and new columns as values like {'F5_01': 'F5_01_rec', 'F5_02': 'F5_02_rec'}
values : dict[int, Any]: a dictionary with the new codes as keys and the old codes that should be recoded as values, like {1: [1, 2], 2: range(3, 6), 3: 6}. This means that the old codes 1 and 2 become the new code 1, the codes 3, 4 and 5 become the new code 2, and the code 6 becomes the new code 3.
keep_untouched_codes : bool, optional: For example: When a code 7 exists, but is not changed with the "values" argument, it is kept as it was if True. If False, all answers with code 7 will be set to Null and the label for the code 7 will be deleted. Defaults to True.
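
The shape of the values dictionary (new code as key; a single code, a list or a range as value) can be sketched as follows. The helpers are hypothetical and operate on a plain list instead of a MetaTable; they only illustrate the mapping logic and the effect of keep_untouched_codes described above.

```python
from typing import Any, Optional


def expand_recode_map(values: dict[int, Any]) -> dict[int, int]:
    """Turn {new: old or [old, ...]} into a flat {old: new} lookup."""
    old_to_new: dict[int, int] = {}
    for new_code, old_codes in values.items():
        if isinstance(old_codes, int):
            old_codes = [old_codes]  # a single code such as 6
        for old_code in old_codes:   # a list or a range of codes
            old_to_new[old_code] = new_code
    return old_to_new


def recode_values(data: list[Optional[int]], values: dict[int, Any],
                  keep_untouched_codes: bool = True) -> list[Optional[int]]:
    """Recode a list of codes (illustrative, not the package's method)."""
    old_to_new = expand_recode_map(values)
    result = []
    for code in data:
        if code in old_to_new:
            result.append(old_to_new[code])
        else:
            # Codes not mentioned in values are kept or set to None.
            result.append(code if keep_untouched_codes else None)
    return result


mapping = {1: [1, 2], 2: range(3, 6), 3: 6}
kept = recode_values([1, 2, 3, 5, 6, 7], mapping, keep_untouched_codes=True)
dropped = recode_values([1, 2, 3, 5, 6, 7], mapping, keep_untouched_codes=False)
print(kept)
print(dropped)
```

In the first call the untouched code 7 survives, in the second call it becomes None.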

def rename_columns(self, columns: dict[str, str])

Renames the given columns.

Args

columns : dict[str, str]: Columns to rename: {'old_name': 'new_name'}

def return_components(self) ‑> tuple[pandas.core.frame.DataFrame, pyreadstat._readstat_parser.metadata_container, dict]

Returns the updated DataFrame, the metadata that are contained within the object and the gfs metadata

Returns

tuple[pd.DataFrame, pyreadstat._readstat_parser.metadata_container, dict]: returns tuple with objects

def scale_level(self, columns: list[str], new_scale: str)

Changes the scale levels of the given columns.

Args

columns : list[str]: A list of columns
new_scale : str: The new scale of the given columns. Can be 'nominal', 'scale' or 'ordinal'

def select_columns(self, columns: list)

selects the given columns and removes all others.

Args

columns : list: Columns to select

def show(self, columns: Union[list[str], str] = 'last changed', only_label: bool = False, show_objects: bool = False, total: bool = False)

Shows info about the value labels and variable label of the given columns.

Args

columns : Union[list[str], str]: Columns to be shown. If no column is passed in the last changed columns will be used. columns can either be a list of column names or a single column name as a string.
only_label : bool: Only shows the variable label
show_objects : bool: Also shows the python objects (makes copying easier)
total : bool: makes some changes to output for the show_all function. Not useful to the enduser.

def show_all(self, columns: Union[list[str], str] = 'last changed')

Shows info about the value labels and the meta data of the given columns.

Args

columns : Union[list[str], str]: Columns to be shown. If no column is passed in the last changed columns will be used. columns can either be a list of column names or a single column name as a string.

def show_meta(self, columns: Union[list[str], str] = 'last changed', total: bool = False)

Shows info about the meta data of the given columns.

Args

columns : Union[list[str], str]: Columns to be shown. If no column is passed in the last changed columns will be used. columns can either be a list of column names or a single column name as a string.
total : bool: makes some changes to output for the show_all function. Not useful to the enduser.

def val_lab(self, columns: list[str], labels: Union[dict[int, str], str], keep_untouched_codes=False)

Changes the value labels of the given columns

Args

columns : list[str]: A list of columns that need new value labels
labels : Union[dict[int, str], str]: A dictionary with new labels {1 : "label for code 1", 2: "label for code 2"} or the variable name with the labels to be used "variable_name"
keep_untouched_codes : bool, optional: This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to False.

def var_lab(self, columns: list[str], text: str)

Changes the variable labels of the given columns

Args

columns : list[str]: A list of columns that need a new variable label
text : str: Text of the variable label

def write_sav(self, path: str)

Writes a sav file of the MetaTable object which includes a Dataframe and the metadata.

Args

path : str: Path of the new file, includes the file name. To save the file in the package directory use "./filename.sav"

class MetaTable2 (df: pandas.core.frame.DataFrame, meta: pyreadstat._readstat_parser.metadata_container, project_number: int, project_name: str, columns: dict[str, Column] = _Nothing.NOTHING, groups: dict[str, Group] = _Nothing.NOTHING, weights: dict[str, Weight] = _Nothing.NOTHING, year: str = '2025', output_path: str = './output/', projects_path: str = './gfs_projects/')

Creates MetaTable object that makes data wrangling of files read with pyreadstat easy.

Args

df : pd.DataFrame: DataFrame read with pyreadstat
meta : pyreadstat._readstat_parser.metadata_container: pyreadstat metadata
project_number : int: The project number. Is used as a folder name for the generated files
project_name : str: The name of the project. Is used as a folder name for the generated files
columns : dict[str, Column]: metadata of all dataframe columns. Is part of gfs-meta
groups : dict[str, Group]: metadata of all column groups. Is part of gfs-meta
year : str: current year. Is used to choose the save folder
output_path : str, optional: Path of the output. Defaults to './output/'.
projects_path : str, optional: Path of the git projects folder. Defaults to './gfs_projects/'.

Method generated by attrs for class MetaTable2.

Class variables

var columns : dict[str, Column]
var df : pandas.core.frame.DataFrame
var groups : dict[str, Group]
var meta : pyreadstat._readstat_parser.metadata_container
var output_path : str
var project_name : str
var project_number : int
var projects_path : str
var weights : dict[str, Weight]
var year : str

Methods

def add_column_to_group(self, column: str, group: str)

Adds column to a group.

Args

column : str: Name of the column
group : str: Group to add the column

def add_to_group(self, column: str)

Adds info to the group that the column belongs to it.

Args

column : str: Name of the column

def calculate_weight(self, column: str, target_values: dict, return_df: bool = False)

Calculates the weights for a given weight column based on the target values. You need to create the weight before you can calculate it.

Args

column : str: Name of the weight column
target_values : dict: Dictionary with the target values for each intersection. This is a nested dictionary and it accepts absolute values. Can be like {1: {1: 20, 2: 25}, 2: {1: 40, 2: 50}}
return_df : bool: If True it will return the weighted dataframe. Defaults to False.
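
The nested target_values dictionary can be read as one target count per intersection of the weight columns. The following sketch assumes simple cell weighting, where each row receives target count divided by observed count of its cell. This is an assumption for illustration; the method that calculate_weight actually applies is not documented here and may differ.

```python
from collections import Counter


def cell_weights(rows: list[tuple[int, int]],
                 target_values: dict[int, dict[int, float]]) -> list[float]:
    """Cell weighting for two columns (assumed method, for illustration only)."""
    observed = Counter(rows)  # observed count per (first, second) intersection
    return [target_values[first][second] / observed[(first, second)]
            for first, second in rows]


# Each tuple holds the codes of the two weight columns for one respondent.
rows = [(1, 1), (1, 1), (1, 2), (2, 1), (2, 2), (2, 2)]
targets = {1: {1: 1, 2: 2}, 2: {1: 2, 2: 1}}
weights = cell_weights(rows, targets)
print(weights)
```

After weighting, the weights within each cell sum to the target count of that cell.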

def check_duplicates(self, row: np.array) ‑> bool

function to check for duplicates in a row

Args

row : np.array: row to apply the function to

Returns

bool: if duplicates exist returns True, else False
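
A row-wise duplicate check of this kind can be sketched in a few lines. The function below is a hypothetical stand-alone version; ignoring missing values (None or NaN) is an assumption and not confirmed by the documentation.

```python
import math


def has_duplicates(row) -> bool:
    """Return True if any non-missing value occurs more than once in the row."""
    seen = set()
    for value in row:
        # Assumption: missing values never count as duplicates.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        if value in seen:
            return True
        seen.add(value)
    return False


print(has_duplicates([1, 2, 2]))
print(has_duplicates([1, 2, 3]))
```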

def check_missing_columns(self, expected_columns: list)

def check_sav_prebreak(self, method: str = 'CATI', Interviewer_Column: str = 'ENQ2', Interview_Duration_Column: str = 'DURINT', Date_Column: str = 'DATE') ‑> None

checks and prints multiple key features of a sav file from nebu

Args

method : str, optional: string to indicate the method. Defaults to "CATI". possible values are "CATI" or "OTHER"
Interviewer_Column : str, optional: Interviewer Code Column. Defaults to "ENQ2".
Interview_Duration_Column : str, optional: Interview Duration Column. Defaults to "DURINT".
Date_Column : str, optional: Interview Date Column. Defaults to "DATE".

def copy_column(self, old_column: str, new_column: str, same_group: bool = True, add_to_group: bool = True)

Creates a copy of a column and gives it a new name.

Args

old_column : str: Name of the column to be copied
new_column : str: Name of the new column
same_group : bool, optional: If True the column will be added to the same group as the copied column. Defaults to True.

def copy_group(self, old_group: str, new_group: str)

Creates a copy of a column group and gives it a new name.

Args

old_group : str: Name of the group to be copied
new_group : str: Name of the new group

def create_column(self, column: str, label: str, value_labels: Union[dict[int, str], str, ForwardRef(None)] = None, measure: str = 'scale')

Creates a new column in the MetaTable and the DataFrame.

Args

column : str: Name of the column
label : str: Label of the column
value_labels : Union[dict[int, str], str], optional: Value labels of the column. Defaults to None.
measure : str, optional: Measure of the column. Defaults to 'scale'.

def create_group(self, group_name: str, columns: list[str], kind: str, measure: str = 'auto', lfm: str = 'yes', mean: str = 'auto', group_label: str = 'auto', group_value_labels: Union[dict[int, str], str] = 'auto', missing_values: Union[list[float], str] = 'auto')

Creates a new group in the MetaTable.

Args

group_name : str: Name of the group
columns : list[str]: List of columns that belong to the group
kind : str: Kind of the group. Can be 'multi' or 'batch'
measure : str, optional: Measure of the group. Can be 'string', 'nominal', 'scale' or 'ordinal'. Defaults to 'auto'.
lfm : str, optional: Decides if the value labels of the group should be used for all columns in the group. Can be 'yes' or 'no'. Defaults to 'yes'.
mean : str, optional: Decides if there is a useful mean value for a group of columns. Can be 'yes' or 'no'. Defaults to 'auto'.
group_label : str, optional: Label of the group. Defaults to 'auto'.
group_value_labels : Union[dict[int, str], str], optional: Value labels of the group. Defaults to 'auto'.
missing_values : Union[list[float], str], optional: List of missing values of the group. Defaults to 'auto'.

def create_weight(self, name: str, columns: list[str])

Creates a new weight based on the given columns.

Args

name : str: Name of the new weight
columns : list[str]: List of columns that should be used to calculate the weight.

def delete_column(self, column: str)

Deletes a column and the information about it from its group.

Args

column : str: Name of the column

def delete_group(self, group: str)

Deletes a column group and the information about it in every column.

Args

group : str: Name of the group

def encode(self, old_column: str, new_column: str, values: Optional[dict[str, int]] = None)

Encodes a column based on a dictionary with the new values.

Args

old_column : str: Name of the column
new_column : str: Name of the new column
values : Optional[dict[str, int]]: A dictionary with the old values as keys and the new values that replace them. Can be {"yes": 1, "no": 2}. This will replace every "yes" with 1 and every "no" with 2.

def export_coding_excel(self, column_lists: list[list[str]], filename: str = 'toCode', darker_columns: list = None, use_value_labels: bool = False) ‑> None

Exports a .xlsx-file with the given columns and their value labels. This is mostly used for coding open questions.

Args

column_lists : list[list[str]]: List of lists with the columns that should be exported. Can be [['CODERESP'], ['F1@', 'F1_01', 'F1_02', 'F1_03']]. All columns of every sublist will have the same background color.
filename : str: File name of the .xlsx file. Defaults to 'toCode'.
darker_columns : list: List of columns that should have a darker background color. Defaults to None.
use_value_labels : bool: If True it will display the value labels instead of the codes. Defaults to False.

def export_config(self, export_df: bool = True, gfs_config_name: str = 'gfs-config')

Exports an excel-file that makes changing the meta data very simple.

Args

export_df : bool: If True it will also export the data. Defaults to True.

def export_data(self, file_name: str = 'fertig') ‑> None

Exports a .SAV-file and a gfs-meta JSON-file.

Args

file_name : str: File name of the .sav file

def filter_label(self, column: str, filter_label: str)

Updates the filter_label of a column.

Args

column : str: Name of the column
filter_label : str: New filter_label of the column. This label adds information about the filter that was used for that question in the questionnaire.

def get_intersection_counts(self, categorical_columns: list[str])

Get the count for all combinations of the given columns.

Args

categorical_columns : list[str]: List of columns that should be used to calculate the intersection counts.
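
Counting all combinations of several categorical columns amounts to counting tuples of values. The sketch below works on a list of dictionaries instead of the MetaTable DataFrame; the function name and the column names 'sex' and 'age' are hypothetical.

```python
from collections import Counter


def intersection_counts(records: list[dict],
                        categorical_columns: list[str]) -> dict[tuple, int]:
    """Count how often each combination of the given columns occurs."""
    return dict(Counter(
        tuple(record[column] for column in categorical_columns)
        for record in records
    ))


records = [
    {"sex": 1, "age": 1},
    {"sex": 1, "age": 2},
    {"sex": 1, "age": 1},
    {"sex": 2, "age": 2},
]
counts = intersection_counts(records, ["sex", "age"])
print(counts)
```

Combinations that never occur in the data, such as (2, 1) here, do not appear in the result of this sketch.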

def group_filter_label(self, group: str, filter_label: str)

Updates the filter_label of a group of columns.

Args

group : str: Name of the group
filter_label : str: New filter_label of the group. This label adds information about the filter that was used for that question in the questionnaire.

def group_has_mean(self, group: str, mean: str)

Updates if there is a useful mean value for a group of columns.

Args

group : str: Name of the group
mean : str: New mean state of the group. Should be "yes" or "no"

def group_kind(self, group: str, kind: str)

Updates the kind of a group of columns.

Args

group : str: Name of the group
kind : str: New kind of the group. Should be "multi", "single" or "batch"

def group_label(self, group: str, text: str, verbose: bool = True)

Updates the group_label of a group of kind = "batch" or "multi".

Args

group : str: Name of the group of columns
text : str: New text of the group_label
verbose : bool: Prints warnings if True. Defaults to True.

def group_lfm(self, group: str, lfm: str)

Updates the lfm (label from group) of a group of columns.

Args

group : str: Name of the group
lfm : str: New lfm of the group. Should be "yes" or "no"

def group_measure(self, group: str, measure: str)

Updates the measure of a group of columns.

Args

group : str: Name of the group
measure : str: New measure of the group. Should be "nominal", "string", "scale" or "ordinal"

def group_missing_values(self, group: str, missing_values: list[float])

Updates the missing values of a group of columns.

Args

group : str: Name of the group
missing_values : list[float]: New missing values of the group.

def group_value_labels(self, group: str, value_labels: Union[dict[int, str], str], keep_untouched_codes: bool = False)

Updates the value labels of a group of columns.

Args

group : str: Name of the group
value_labels : Union[dict[int, str], str]: A dictionary with new labels {1 : "label for code 1", 2: "label for code 2"} or the column name with the labels to be used "column_name"
keep_untouched_codes : bool, optional: This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to False.

def has_mean(self, column: str, mean: str)

Updates if there is a useful mean value for a column.

Args

column : str: Name of the column
mean : str: New mean state of the column. Should be "yes" or "no"

def import_config(self, gfs_config_name: str = 'gfs-config') ‑> None

Imports the gfs-config excel-file and updates the MetaTable according to the changes made in excel.

Args

gfs_config_name : str, optional: name of config if it should not be default or multiple configs are used. Defaults to "gfs-config".

Raises

FileNotFoundError: if a config file is not found
ValueError: description

def item_label(self, column: str, text: str, verbose: bool = True)

Updates the item_label of a variable of kind = "batch".

Args

column : str: Name of the column
text : str: New text of the item_label
verbose : bool: Prints warnings if True. Defaults to True.

def kind(self, column: str, kind: str)

Updates the kind of a column.

Args

column : str: Name of the column
kind : str: New kind of the column. Should be "multi", "single" or "batch"

def make_quota_check(self, columns: list[str], filename_quotas: str = 'cross_tab', filename_quota_check: str = 'quota_check', calc_quota_difference: bool = False, save_quota_check: bool = False)

Calculates the difference between a crosstab and a quota

Args

columns : list[str]: list of dataframe columns in crosstab
filename_quotas : str, optional: name of the excel file where the crosstab is
filename_quota_check : str, optional: name of the excel file where the difference in quotas is saved
calc_quota_difference : bool, optional: boolean to indicate if difference in quota is calculated
save_quota_check : bool, optional: boolean to indicate if difference in quota is saved in an excel file
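
The idea of a quota check can be sketched as a cell-by-cell difference between observed counts and quota targets. The function below is hypothetical and uses plain dictionaries instead of the Excel files; the sign convention (observed minus quota) is an assumption.

```python
def quota_difference(observed: dict[tuple, int],
                     quotas: dict[tuple, int]) -> dict[tuple, int]:
    """Observed count minus quota per cell; missing cells count as 0."""
    cells = sorted(set(observed) | set(quotas))
    return {cell: observed.get(cell, 0) - quotas.get(cell, 0) for cell in cells}


observed = {(1, 1): 48, (1, 2): 55, (2, 1): 50}
quotas = {(1, 1): 50, (1, 2): 50, (2, 1): 50, (2, 2): 50}
difference = quota_difference(observed, quotas)
print(difference)
```

Negative values mark cells that are still below their quota, positive values mark cells that exceed it.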

def measure(self, column: str, measure: str)

Updates the measure of a column.

Args

column : str: Name of the column
measure : str: New measure of the column. Should be "nominal", "string", "scale" or "ordinal"

def merge_open_questions(self, df_open_questions: pandas.core.frame.DataFrame, columns: list, code_list: dict, group_name: str, group_label: str = '', merge_Id: str = 'CODERESP', group_kind='multi', measure: str = 'auto', check_for_duplicates: bool = True) ‑> None

merges open questions with the metatable dataframe

Args

df_open_questions : pd.DataFrame: open question dataframe
columns : list: list of columns to merge (normally a group)
code_list : dict: dictionary with the new code list (used for group value labels)
group_name : str: group name to use
group_label : str, optional: Label for the group. Defaults to "".
merge_Id : str: id to merge columns on, defaults to CODERESP
group_kind : str, optional: kind of group. Defaults to 'multi'.
measure : str, optional: measure of the group. Defaults to 'auto'.
check_for_duplicates : bool, optional: Overrides the duplicate check. Duplicates are not checked if set to False. Defaults to True.

def merge_semiopen_questions(self, df_semiopen_questions: pandas.core.frame.DataFrame, columns: list, code_list: dict, group_name: str, merge_Id: str = 'CODERESP', check_for_duplicates: bool = True) ‑> None

merges semi open questions with the metatable dataframe

Args

df_semiopen_questions : pd.DataFrame: semiopen questions dataframe
columns : list: list of columns to merge (normally a group)
code_list : dict: dictionary with the new code list (used for group value labels)
group_name : str: group name to use
merge_Id : str: id to merge columns on, defaults to CODERESP
check_for_duplicates : bool, optional: Overrides the duplicate check. Duplicates are not checked if set to False. Defaults to True.

def missing_values(self, column: str, missing_values: list[float])

Updates the missing values of a column.

Args

column : str: Name of the column
missing_values : list[float]: New missing values of the column.

def move_column(self, column: str, end: bool = True)

Moves a column to the beginning or the end of the MetaTable

Args

column : str: Column to be moved
end : bool, optional: If end is True the column is moved to the end, if end is False the column is moved to the beginning. Defaults to True.

def move_columns(self, column_order: list)

Moves columns based on the desired order in the MetaTable.

Args

column_order : list: The desired column order.

def randomise_divers_gender(self, gender_column='S11', divers_values: list = [3], seed: int = 12345)

Randomises the divers gender value to either 1 or 2 with a chance of 50/50. Asserts that 1 and 2 are the male and female values.

Args

gender_column : str, optional: column name which has the values for gender. Defaults to "S11".
divers_values : list[int], optional: Values that correspond to divers labels. If multiple are given, the randomisation is executed for each label sequentially. Defaults to [3].
seed : int, optional: randomised seed, should normally not be changed. Defaults to 12345.
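
A seeded 50/50 replacement of this kind can be sketched with the standard library. The function randomise_codes is a hypothetical stand-in that works on a plain list; unlike the description above, it handles all divers values in a single pass, and its random draws will not match those of the package.

```python
import random
from typing import Optional


def randomise_codes(values: list[int], divers_values: Optional[list[int]] = None,
                    seed: int = 12345) -> list[int]:
    """Replace divers codes by 1 or 2 with equal probability (illustrative)."""
    divers = set(divers_values if divers_values is not None else [3])
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    return [rng.choice([1, 2]) if value in divers else value for value in values]


gender = [1, 3, 2, 3, 3]
first_run = randomise_codes(gender)
second_run = randomise_codes(gender)
print(first_run == second_run)
```

Because the seed is fixed, repeated runs on the same data produce the same assignment, which is why the seed should normally not be changed.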

def recode(self, old_column: str, new_column: str, values: dict[int, typing.Any], keep_untouched_codes: bool = True)

Recodes a column based on a dictionary with the new values.

Args

old_column : str: Name of the column
new_column : str: Name of the new column
values : dict[int, Any]: A dictionary with the new values and the old values that should be replaced. Can be {1: range(1, 20), 2: [20, 21], 3: 22}. This will replace all values from 1 to 19 with 1, 20 and 21 with 2 and 22 with 3.
keep_untouched_codes : bool: This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to True.

def recode_group(self, old_group: str, new_group: str, values: dict[int, typing.Any], keep_untouched_codes: bool = True)

Recodes a group of columns based on a dictionary with the new values.

Args

old_group : str: Name of the group
new_group : str: Name of the new group
values : dict[int, Any]: A dictionary with the new values and the old values that should be replaced. Can be {1: range(1, 20), 2: [20, 21], 3: 22}. This will replace all values from 1 to 19 with 1, 20 and 21 with 2 and 22 with 3.
keep_untouched_codes : bool: This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to True.

def remove_from_group(self, column: str)

Removes a column from a column group.

Args

column : str: Name of the column

def remove_speeders(self, speeder_value: float = None, Interview_Duration_Column: str = 'DURINT') ‑> None

Remove speeder rows from the DataFrame where interview duration is below the calculated speeder threshold.

Args

speeder_value : float: The precalculated speeder threshold value. If not provided, it will be calculated using _calculate_speeder_value.
Interview_Duration_Column : str: Name of the column containing interview durations. Default is "DURINT".
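
The filtering step itself can be sketched as follows, assuming the threshold is passed in explicitly. How the package derives the threshold in _calculate_speeder_value is not documented here and is therefore left out; drop_speeders is a hypothetical helper on a list of dictionaries.

```python
def drop_speeders(records: list[dict], speeder_value: float,
                  duration_column: str = "DURINT") -> list[dict]:
    """Keep only rows whose duration is not below the threshold."""
    return [record for record in records
            if record[duration_column] >= speeder_value]


interviews = [{"DURINT": 120.0}, {"DURINT": 610.0}, {"DURINT": 300.0}]
remaining = drop_speeders(interviews, speeder_value=300.0)
print(remaining)
```

Rows with a duration exactly equal to the threshold are kept in this sketch, since only durations below the threshold count as speeders.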

def rename_column(self, name: str, new_name: str)

Renames a column.

Args

name : str: Old name of the column
new_name : str: New name of the column

def rename_group(self, group: str, new_group_name: str)

renames a group

Args

group : str: group to rename
new_group_name : str: new group name

def select_columns(self, columns: list)

Selects columns and removes the others

Args

columns : list: Names of the columns to select

def show_column_info(self, column: str, show_objects: bool = False)

Shows info about the value labels and variable label of the given column.

Args

column : str: Column to be shown
show_objects : bool: If True, prints the value_labels lists.

def show_column_meta(self, column: str)

Shows info about the meta data of the given column.

Args

column : str: Column to be shown

def show_crosstab(self, columns: list[str], save_crosstab: bool = False, cross_tab_name: str = 'cross_tab', drop_na: bool = False, show_margins: bool = True)

Creates and shows a crosstab with a set of row and a set of column breaks

Args

columns : list[str]: list of dataframe columns in crosstab
save_crosstab : bool, optional: boolean to indicate if crosstab is saved in an excel file
cross_tab_name : str, optional: name of the excel file
drop_na : bool: if True it doesn't show rows and columns if all of their values are zero
show_margins : bool: Shows the total of rows and columns if True
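
To make the effect of show_margins concrete, here is a small dependency-free sketch that counts combinations of two columns and optionally appends row and column totals. It is a stand-in for illustration only: crosstab_counts, the column names, and the sample rows are assumptions, and the package's own method may build its table differently.

```python
from collections import Counter


def crosstab_counts(rows, row_key, col_key, show_margins=True):
    """Count (row, column) combinations; optionally add 'Total' margins."""
    counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_levels = sorted({r for r, _ in counts})
    col_levels = sorted({c for _, c in counts})
    # Counter returns 0 for combinations that never occur.
    table = {r: {c: counts[(r, c)] for c in col_levels} for r in row_levels}
    if show_margins:
        for r in row_levels:
            table[r]["Total"] = sum(table[r][c] for c in col_levels)
        table["Total"] = {
            c: sum(table[r][c] for r in row_levels) for c in col_levels
        }
        table["Total"]["Total"] = sum(counts.values())
    return table


# Illustrative data only.
respondents = [
    {"sex": "f", "age": "18-39"},
    {"sex": "f", "age": "40+"},
    {"sex": "m", "age": "18-39"},
    {"sex": "m", "age": "18-39"},
]

table = crosstab_counts(respondents, "sex", "age")
print(table["m"]["18-39"], table["Total"]["18-39"], table["Total"]["Total"])  # 2 3 4
```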

def show_group_info(self, group: str)

Shows info about the given group.

Args

group : str: The name of a group of columns

def show_group_meta(self, group: str)

Shows info about the meta data of the given group.

Args

group : str: The name of a group of columns

def single_label(self, column: str, text: str, verbose: bool = True)

Updates the label of a variable of kind = "single".

Args

column : str: Name of the column
text : str: New text of the label
verbose : bool: Prints warnings if True. Defaults to True.

def value_labels(self, column: str, value_labels: Union[dict[int, str], str], keep_untouched_codes: bool = False)

Updates the value labels of a column.

Args

column : str: Name of the column
value_labels : Union[dict[int, str], str]: A dictionary with new labels {1 : "label for code 1", 2: "label for code 2"} or the column name with the labels to be used "column_name"
keep_untouched_codes : bool, optional: This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to False.
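
The difference between the two keep_untouched_codes settings can be sketched with plain dictionaries. merge_value_labels and the sample labels below are hypothetical and only illustrate the documented behaviour; they are not the package's implementation.

```python
def merge_value_labels(
    old: dict[int, str],
    new: dict[int, str],
    keep_untouched_codes: bool = False,
) -> dict[int, str]:
    """Return the labels a column would carry after an update."""
    if keep_untouched_codes:
        # Old labels survive unless a new label exists for the same code.
        return {**old, **new}
    # Otherwise the new labels replace the old set entirely.
    return dict(new)


# Illustrative labels only.
old_labels = {1: "Yes", 2: "No", 99: "Don't know"}
new_labels = {1: "Agree", 2: "Disagree"}

print(merge_value_labels(old_labels, new_labels))
# {1: 'Agree', 2: 'Disagree'}
print(merge_value_labels(old_labels, new_labels, keep_untouched_codes=True))
# {1: 'Agree', 2: 'Disagree', 99: "Don't know"}
```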

class gfs_Maps (info: dict, variables: list, map_scope: Map_Scope_Enum, map_level: Map_LOD_Enum, breaks: list = None, exclusion_filter=[], subplot_columns: int = 3, language: str = 'de', disable_hover_info: bool = False, background_map_color: str = '#999', projection: str = 'mercator', show_mean: bool = False, show_count: bool = False, show_total: bool = False, show_legend: bool = True, legend_labels_rename: dict[str, str] = None, left_labels_rename: dict[str, str] = None, break_labels_rename: dict[str, str] = None, legend_break: list = None, order_by: dict = None, special_variables: list[int] = [96, 97, 98, 99999996, 99999997, 99999998], standard_arguments: dict = None, title_custom_text: str = 'auto', subtitle_custom_text: str = 'auto', tag_custom_text: str = 'auto', legend_title_custom_text: str = 'auto', title_position: int = 1, subtitle_position: int = 2, color_theme: str = 'auto', color_type: str = 'single_hue', color_direction: str = 'backward', color_custom: list[str] = None, legend_position: str = 'right', legend_labels_size: float = 14, weight: str = 'auto', select_variable_levels: list[int] = None, tag_position: int = 3, left_labels_wrap: int = 50, colorbar_length: float = 0.7, colorbar_x: float = 0.9, colorbar_y: float = 0.5, tag_add_before: str = '', tag_add_after: str = '', subtitle_add_before: str = '', subtitle_add_after: str = '', left_labels_remove_before: str = '', left_labels_remove_after: str = '', title_remove_before: str = '', title_remove_after: str = '', label_size: int = 16, select_min_count: int = 0, save_figure: bool = True, df: pandas.core.frame.DataFrame = 'auto', meta: pyreadstat._readstat_parser.metadata_container = 'auto')

Creates a map chart

Args

info : dict: A dictionary with info about the presentation. Has info like number of slides, presentation name and the presentation object itself ~ usually info
variables : list: A list of variables. If multiple are added, function doesn't accept any breaks. ~ ['AW2'] /['AW1_1', 'AW1_2', …]
map_scope : Map_Scope_Enum: defines a preset for the scope of the map, e.g. EU with all its countries
map_level : Map_LOD_Enum: defines the level of detail for the map, e.g. metropolitan areas, country or others
breaks : list, optional: A list of breaks. ~ ['alter_break', 'sex_break']. Defaults to None.
exclusion_filter : list: a list of strings representing the NUTS IDs to exclude from the maps, example: ["CH025", "CH022", "CH024"], see: https://ec.europa.eu/eurostat/web/nuts/nuts-maps for the available IDs
subplot_columns : int: number of subplot columns to use if there are multiple maps
language : str: language to label the shapes, must be present in translation json
disable_hover_info : bool: disables hover info on shapes
background_map_color : str: hex color code to set the background color of map shapes, default is "#999"
projection : str: projection type as a string or None ('equirectangular', 'mercator', 'orthographic', 'natural earth', 'kavrayskiy7', 'miller', 'robinson', 'eckert4', 'azimuthal equal area', 'azimuthal equidistant', 'conic equal area', 'conic conformal', 'conic equidistant', 'gnomonic', 'stereographic', 'mollweide', 'hammer', 'transverse mercator', 'albers usa', 'winkel tripel', 'aitoff' and 'sinusoidal')
show_mean : bool, optional: Uses the mean of the variable as the x-value and changes the scale type from percentage to mean. ~ True. Defaults to False.
show_count : bool, optional: Turns counts in variable labels on and off. ~ True. Defaults to False.
show_total : bool, optional: Turns total bar on and off. ~ True. Defaults to False.
show_legend : bool, optional: Turns legend on and off. ~ False. Defaults to True.
legend_labels_rename : dict[str, str], optional: Renames the labels of the different legend levels. ~ {'old_name': 'new_name'}. Defaults to None.
left_labels_rename : dict[str, str], optional: Renames the labels of the different break levels. ~ {'old_name': 'new_name'}. Defaults to None.
break_labels_rename : dict[str, str], optional: Renames the breaks. ~ {'old_name': 'new_name'}. Defaults to None.
legend_break : list, optional: A list of breaks. This is used to create the legend break if either "mean" == True or "select_variable_levels" has exactly one item ~ ['alter_break'] / [['alter_break']]. Defaults to None.
order_by : dict, optional: A dictionary. Orders the plot by any variable in the x data frame. ~ {'mean': 'asc'} / {1: 'desc'}. Defaults to None.
special_variables : list[int], optional: A list of variable levels that will be treated differently. They will appear gray in the graph and have no effect on the mean calculation. ~ [16, 99999997]. Defaults to [96, 97, 98, 99999996, 99999997, 99999998].
standard_arguments : dict, optional: Adds some standard arguments that normally don't need to be changed. Defaults to None.
title_custom_text : str, optional: Creates custom title text, if automatic text isn't desired. Defaults to 'auto'.
subtitle_custom_text : str, optional: Creates custom subtitle text, if automatic text isn't desired. Defaults to 'auto'.
tag_custom_text : str, optional: Creates custom tag text, if automatic text isn't desired. Defaults to 'auto'.
legend_title_custom_text : str, optional: Creates a custom legend title, if title is desired. Defaults to 'auto'.
title_position : int, optional: Sets the position of the title text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 1.
subtitle_position : int, optional: Sets the position of the subtitle text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 2.
color_theme : str, optional: Sets color theme for graph. ~ 'test', 'gfs'. Defaults to 'auto'.
color_type : str, optional: Sets color value type. ~ 'diverging', 'groups'. Defaults to 'single_hue'.
color_direction : str, optional: Sets direction of color pattern. ~ 'forward'. Defaults to 'backward'.
color_custom : list[str], optional: Uses custom colors ~ ['#131366', '#454578']. Defaults to None.
value_labels_show_min : float, optional: Sets the threshold for showing labels by %. Use 0 to show all labels and 100 to show none. Use numbers in between for a mix. Defaults to 0.
legend_position : str, optional: Sets the position of the legend. ~ 'bottom'. Defaults to 'right'.
legend_labels_size : float, optional: Sets the size of legend. Defaults to 14.
weight : str, optional: Turns weight on if weight variable name is added. ~ "gewicht". Defaults to 'auto'.
select_variable_levels : list[int], optional: A list of variable levels used to select and order variables. ~ [2, 1, 99999997]. Defaults to None.
tag_add_before : str, optional: Adds text before the tag text. ~ "Meine Zusatzinfo, ', " adds this string before tag. Defaults to ''.
tag_add_after : str, optional: Adds text after the tag text. ~ "Meine Zusatzinfo, ', " adds this string after tag. Defaults to ''.
tag_position : int, optional: Sets the position of the tag text. ~ 0 = don't show, 1, 2, 3 are possible positions on plot. Defaults to 3.
left_labels_wrap : int, optional: Wraps the variable labels. Defaults to 50.
colorbar_length : float, optional: scales the colorbar length. Defaults to 0.7.
colorbar_x : float, optional: changes the x coordinate of the colorbar. Defaults to 0.9.
colorbar_y : float, optional: changes the y coordinate of the colorbar. Defaults to 0.5.
subtitle_add_before : str, optional: Adds text before the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string before the subtitle. Defaults to ''.
subtitle_add_after : str, optional: Adds text after the subtitle text. ~ "Filter: F3 = 'Ja', " adds this string after the subtitle. Defaults to ''.
left_labels_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
left_labels_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
title_remove_before : str, optional: Removes the label text before the given string. ~ "? " deletes everything up until and including the question mark. Defaults to ''.
title_remove_after : str, optional: Removes the label text after the given string. ~ ". Bitte" deletes everything after and including the punctuation mark. Defaults to ''.
label_size : int, optional: Sets the size of labels. Defaults to 16.
select_min_count : int, optional: Selects all bars with higher count than given. ~ 10 means that only bars with more than 10 observations will be selected. Defaults to 0.
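
The remove_before and remove_after arguments describe simple string trimming, which can be sketched as follows. trim_label and the sample label are hypothetical; in particular, cutting at the first occurrence of each marker is an assumption, since the documentation does not say which occurrence the package uses.

```python
def trim_label(label: str, remove_before: str = "", remove_after: str = "") -> str:
    """Trim a label around two optional marker strings."""
    # Drop everything up to and including the first occurrence of remove_before.
    if remove_before and remove_before in label:
        label = label.split(remove_before, 1)[1]
    # Drop the first occurrence of remove_after and everything following it.
    if remove_after and remove_after in label:
        label = label.split(remove_after, 1)[0]
    return label


# Illustrative label only.
raw_label = "Wie zufrieden sind Sie? Bus und Tram. Bitte eine Antwort geben."

print(trim_label(raw_label, remove_before="? ", remove_after=". Bitte"))
# Bus und Tram
```

With both markers left at their default of '', the label is returned unchanged.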

Methods

def clip_coordinates(self, unclipped_shapes: Map_Scope_Enum)

clips coordinates of background to a preset for the scope of the map and returns the background shape

Args

unclipped_shapes : Map_Scope_Enum: Enum to define the preset (EU or CH)

def create_map_figure(self, foreground_shapes: geopandas.geodataframe.GeoDataFrame, clipped_foreground_shapes: geopandas.geodataframe.GeoDataFrame)

creates map figures and combines background and foreground shapes with each other

Args

foreground_shapes : gpd.GeoDataFrame: geopandas dataframe with the foreground shapes data
clipped_foreground_shapes : gpd.GeoDataFrame: geopandas dataframe with the clipped foreground shapes data

Returns

go.Figure: plotly Figure with map objects

def create_maps(self) ‑> Optional[tuple]

Creates a map chart

Returns: Optional[tuple]: returns df and meta objects if return_data is True

def create_maps_df(self) ‑> pandas.core.frame.DataFrame

creates dataframe for a map graphic

Returns: pd.DataFrame: returns pandas dataframe for map

def get_foreground_shapes(self, resolution='03') ‑> geopandas.geodataframe.GeoDataFrame

gets foreground shape for the maps

Args

resolution : str: resolution size of the standard "NUTS" shapefiles - possible values are 03,10,20, default is 03

Returns: gpd.GeoDataFrame: returns filtered geopandas dataframe by level and scope

def get_translations(self) ‑> str

gets translation json for maps and map objects for the map scope

Returns: str: json string with all map objects

def init_shape_file(self, resolution, file=None) ‑> geopandas.geodataframe.GeoDataFrame

reads shapefile from the file system and loads it into a geopandas dataframe

Args

resolution : str: resolution size of the standard "NUTS" shapefiles - possible values are 03,10,20
file : str: file path to your own shapefile if you need it

Returns: gpd.GeoDataFrame: returns geopandas dataframe

def shape_clipping(self, foreground_shapes: geopandas.geodataframe.GeoDataFrame, exclusion_filter: list) ‑> geopandas.geodataframe.GeoDataFrame

removes shapes with a list of NUTS ID objects

Args

foreground_shapes : gpd.GeoDataFrame: shapes to be used in the map plot
exclusion_filter : list[str]: list of NUTS ID objects to remove

Returns: gpd.GeoDataFrame: returns clipped geopandas dataframe
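
The exclusion step amounts to dropping every shape whose NUTS ID appears in the filter list. The sketch below shows that rule on plain dictionaries rather than a GeoDataFrame so that it runs without geopandas; exclude_shapes, the NUTS_ID key, and the sample records are assumptions for illustration and not the package's code.

```python
def exclude_shapes(shapes: list[dict], exclusion_filter: list[str]) -> list[dict]:
    """Return the shapes whose NUTS ID is not listed in exclusion_filter."""
    excluded = set(exclusion_filter)  # set lookup keeps the filter step cheap
    return [shape for shape in shapes if shape["NUTS_ID"] not in excluded]


# Illustrative records only; real shapes would also carry geometry.
shapes = [
    {"NUTS_ID": "CH011"},
    {"NUTS_ID": "CH022"},
    {"NUTS_ID": "CH025"},
    {"NUTS_ID": "CH031"},
]

remaining = exclude_shapes(shapes, ["CH025", "CH022", "CH024"])
print([shape["NUTS_ID"] for shape in remaining])  # ['CH011', 'CH031']
```

IDs in the filter that match no shape, such as "CH024" here, are simply ignored in this sketch.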