Module functions.tools.metatable_old
Classes
class MetaTable (df: pandas.core.frame.DataFrame, meta: pyreadstat._readstat_parser.metadata_container, project_name: str, year: str = '2023')-
Creates a MetaTable object that simplifies data wrangling for files read with pyreadstat.
Args
df:pd.DataFrame- DataFrame read with pyreadstat
meta:pyreadstat._readstat_parser.metadata_container- pyreadstat metadata
project_name:str- The name of the project. It is used as the folder name for the generated files
gfs_meta:dict- gfs-specific metadata that is used in other gfs products
Methods
def create_column_copy(self, new_column: str, column: str, copy_values: bool)-
Creates a new column that's a copy of another column in the MetaTable
Args
new_column:str- Name of the new column
column:str- Name of the column to be copied
copy_values:bool- If True, the DataFrame values are copied; if False, the new column is filled with NaN
def create_empty_columns(self, columns: list[str], label: str, variable_labels: dict[int, str])-
def export_config(self)-
Exports an Excel file that makes it easy to edit the metadata.
def import_config(self)-
Imports the Excel file with the edited metadata and updates the MetaTable accordingly.
def recode(self, columns: Union[list[str], dict[str, str]], values: dict[int, typing.Any], keep_untouched_codes=True)-
Recodes the chosen columns.
Args
columns:Union[list[str], dict[str, str]]- Either a list of the columns that should be recoded in place, like ['F5_01', 'F5_02'], or a dictionary with the columns that should be kept unchanged as keys and the new columns that receive the recoded values as values, like {'F5_01': 'F5_01_rec', 'F5_02': 'F5_02_rec'}
values:dict[int, Any]- A dictionary with the new codes as keys and the old codes that should be recoded as values, like {1: [1, 2], 2: range(3, 6), 3: 6}. This means that the old codes 1 and 2 become the new code 1, the codes 3, 4 and 5 become the new code 2, and the code 6 becomes the new code 3.
keep_untouched_codes:bool, optional- Controls what happens to codes that are not covered by the "values" argument. For example, when a code 7 exists but is not recoded: if True, it is kept as it was; if False, all answers with code 7 are set to null and the label for code 7 is deleted. Defaults to True.
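The mapping semantics described for "values" and "keep_untouched_codes" can be illustrated with a small sketch in plain Python. This is not the MetaTable implementation; the helper names build_recode_lookup and recode_values are hypothetical, and the sketch operates on a simple list of codes rather than on a DataFrame column.

```python
from collections.abc import Iterable
from typing import Any, Optional


def build_recode_lookup(values: dict[int, Any]) -> dict[int, int]:
    """Invert {new_code: old_codes} into a flat {old_code: new_code} lookup."""
    lookup: dict[int, int] = {}
    for new_code, old_codes in values.items():
        # A single old code (e.g. 6) is treated like a one-element collection.
        if not isinstance(old_codes, Iterable):
            old_codes = [old_codes]
        for old_code in old_codes:
            lookup[old_code] = new_code
    return lookup


def recode_values(
    data: list[Optional[int]],
    values: dict[int, Any],
    keep_untouched_codes: bool = True,
) -> list[Optional[int]]:
    """Apply the recoding to a list of codes (hypothetical helper)."""
    lookup = build_recode_lookup(values)
    recoded: list[Optional[int]] = []
    for code in data:
        if code in lookup:
            recoded.append(lookup[code])
        elif keep_untouched_codes:
            recoded.append(code)  # code not mentioned in "values": keep it
        else:
            recoded.append(None)  # code not mentioned in "values": set to null
    return recoded


answers = [1, 2, 3, 4, 5, 6, 7]
mapping = {1: [1, 2], 2: range(3, 6), 3: 6}

print(recode_values(answers, mapping))
# [1, 1, 2, 2, 2, 3, 7]
print(recode_values(answers, mapping, keep_untouched_codes=False))
# [1, 1, 2, 2, 2, 3, None]
```

Building the old-to-new lookup first means all codes are recoded in one pass, so an old code 3 that becomes 2 is not recoded a second time by another rule.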
def rename_columns(self, columns: dict[str, str])-
Renames the given columns.
Args
columns:dict[str, str]- Columns to rename: {'old_name': 'new_name'}
def return_components(self) ‑> tuple[pandas.core.frame.DataFrame, pyreadstat._readstat_parser.metadata_container, dict]-
Returns the updated DataFrame, the metadata that are contained within the object and the gfs metadata
Returns
tuple[pd.DataFrame, pyreadstat._readstat_parser.metadata_container, dict]- A tuple containing the DataFrame, the pyreadstat metadata and the gfs metadata, in that order
def scale_level(self, columns: list[str], new_scale: str)-
Changes the scale levels of the given columns.
Args
columns:list[str]- A list of columns
new_scale:str- The new scale of the given columns. Can be 'nominal', 'scale' or 'ordinal'
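As a rough illustration of the allowed values, the sketch below updates a plain dictionary that maps column names to scale levels, similar in shape to the variable_measure dictionary in pyreadstat metadata. The function name set_scale_level and the validation behaviour are assumptions made for this example; the documentation does not say whether MetaTable raises an error for invalid input.

```python
ALLOWED_SCALES = {"nominal", "ordinal", "scale"}


def set_scale_level(
    variable_measure: dict[str, str],
    columns: list[str],
    new_scale: str,
) -> dict[str, str]:
    """Return a copy of the mapping with new_scale set for the given columns."""
    if new_scale not in ALLOWED_SCALES:
        raise ValueError(
            f"new_scale must be one of {sorted(ALLOWED_SCALES)}, got {new_scale!r}"
        )
    missing = [column for column in columns if column not in variable_measure]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    updated = dict(variable_measure)  # leave the input mapping untouched
    for column in columns:
        updated[column] = new_scale
    return updated


measures = {"F5_01": "nominal", "F5_02": "nominal", "age": "scale"}
print(set_scale_level(measures, ["F5_01", "F5_02"], "ordinal"))
# {'F5_01': 'ordinal', 'F5_02': 'ordinal', 'age': 'scale'}
```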
def select_columns(self, columns: list)-
Selects the given columns and removes all others.
Args
columns:list- Columns to select
def show(self, columns: Union[list[str], str] = 'last changed', only_label: bool = False, show_objects: bool = False, total: bool = False)-
Shows info about the value labels and variable label of the given columns.
Args
columns:Union[list[str], str]- Columns to be shown. If no column is passed in, the last changed columns are used. Can be either a list of column names or a single column name as a string.
only_label:bool- If True, only the variable label is shown
show_objects:bool- If True, the Python objects are shown as well (makes copying easier)
total:bool- Adjusts the output for use by the show_all function. Not intended for end users.
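The "columns" argument of show, show_all and show_meta accepts three forms: the default 'last changed', a single column name, or a list of names. A minimal sketch of how such an argument could be normalised to a list is shown below. The helper resolve_columns and the last_changed parameter are hypothetical; how MetaTable tracks the last changed columns internally is not documented here.

```python
from typing import Union

LAST_CHANGED = "last changed"  # default sentinel taken from the signatures


def resolve_columns(
    columns: Union[list[str], str],
    last_changed: list[str],
) -> list[str]:
    """Normalise the "columns" argument to a list of column names."""
    if columns == LAST_CHANGED:
        return list(last_changed)  # nothing passed in: use last changed columns
    if isinstance(columns, str):
        return [columns]  # a single column name
    return list(columns)  # already a list of names


recently_changed = ["F5_01_rec", "F5_02_rec"]
print(resolve_columns("last changed", recently_changed))
# ['F5_01_rec', 'F5_02_rec']
print(resolve_columns("age", recently_changed))
# ['age']
print(resolve_columns(["age", "F5_01"], recently_changed))
# ['age', 'F5_01']
```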
def show_all(self, columns: Union[list[str], str] = 'last changed')-
Shows info about the value labels and the metadata of the given columns.
Args
columns:Union[list[str], str]- Columns to be shown. If no column is passed in, the last changed columns are used. Can be either a list of column names or a single column name as a string.
def show_meta(self, columns: Union[list[str], str] = 'last changed', total: bool = False)-
Shows info about the metadata of the given columns.
Args
columns:Union[list[str], str]- Columns to be shown. If no column is passed in, the last changed columns are used. Can be either a list of column names or a single column name as a string.
total:bool- Adjusts the output for use by the show_all function. Not intended for end users.
def val_lab(self, columns: list[str], labels: Union[dict[int, str], str], keep_untouched_codes=False)-
Changes the value labels of the given columns
Args
columns:list[str]- A list of columns that need new value labels
labels:Union[dict[int, str], str]- Either a dictionary with the new labels, like {1: "label for code 1", 2: "label for code 2"}, or the name of a variable whose value labels should be reused, like "variable_name"
keep_untouched_codes:bool, optional- This will keep the old labels of the column and just add the new ones instead of replacing all labels. Defaults to False.
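The two forms of "labels" and the effect of "keep_untouched_codes" can be sketched with plain dictionaries. This is an illustration under assumptions, not the MetaTable implementation: the helpers resolve_labels and updated_value_labels are hypothetical, and all_labels stands in for a mapping from variable names to their value labels, similar in shape to variable_value_labels in pyreadstat metadata.

```python
from typing import Union

ValueLabels = dict[int, str]


def resolve_labels(
    labels: Union[ValueLabels, str],
    all_labels: dict[str, ValueLabels],
) -> ValueLabels:
    """Return the label dictionary, looking it up by variable name if needed."""
    if isinstance(labels, str):
        return dict(all_labels[labels])  # reuse the labels of another variable
    return dict(labels)


def updated_value_labels(
    old_labels: ValueLabels,
    labels: Union[ValueLabels, str],
    all_labels: dict[str, ValueLabels],
    keep_untouched_codes: bool = False,
) -> ValueLabels:
    """Compute the value labels a column would have after the change."""
    new_labels = resolve_labels(labels, all_labels)
    if keep_untouched_codes:
        # Old labels stay unless a new label is given for the same code.
        return {**old_labels, **new_labels}
    return new_labels  # all old labels are replaced


all_labels = {
    "F1": {1: "yes", 2: "no", 9: "don't know"},
    "F2": {1: "low", 2: "high"},
}
new = {1: "agree", 2: "disagree"}

print(updated_value_labels(all_labels["F1"], new, all_labels))
# {1: 'agree', 2: 'disagree'}
print(updated_value_labels(all_labels["F1"], new, all_labels, keep_untouched_codes=True))
# {1: 'agree', 2: 'disagree', 9: "don't know"}
print(updated_value_labels(all_labels["F1"], "F2", all_labels))
# {1: 'low', 2: 'high'}
```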
def var_lab(self, columns: list[str], text: str)-
Changes the variable labels of the given columns
Args
columns:list[str]- A list of columns that need a new variable label
text:str- Text of the variable label
def write_sav(self, path: str)-
Writes a .sav file containing the DataFrame and the metadata of the MetaTable object.
Args
path:str- Path of the new file, includes the file name. To save the file in the package directory use "./filename.sav"