Module functions.tools.metatable_old
Classes
class MetaTable (df: pandas.core.frame.DataFrame, meta: pyreadstat._readstat_parser.metadata_container, project_name: str, year: str = '2023')
-
Creates a MetaTable object that simplifies data wrangling for files read with pyreadstat.
Args
df
:pd.DataFrame
- DataFrame read with pyreadstat
meta
:pyreadstat._readstat_parser.metadata_container
- pyreadstat metadata
project_name
:str
- The name of the project. It is used as the folder name for the generated files.
gfs_meta
:dict
- gfs-specific metadata that is used in other gfs products
Methods
def create_column_copy(self, new_column: str, column: str, copy_values: bool)
-
Creates a new column that's a copy of another column in the MetaTable
Args
new_column
:str
- Name of the new column
column
:str
- Name of the column to be copied
copy_values
:bool
- If True, the DataFrame values are copied. If False, the new column is filled with NaN (np.nan).
def create_empty_columns(self, columns: list[str], label: str, variable_labels: dict[int, str])
def export_config(self)
-
Exports an Excel file that makes it easy to edit the metadata.
def import_config(self)
-
Imports the Excel file with the edited metadata and updates the MetaTable accordingly.
def recode(self, columns: Union[list[str], dict[str, str]], values: dict[int, typing.Any], keep_untouched_codes=True)
-
Recodes the chosen columns.
Args
columns
:Union[list[str], dict[str, str]]
- Either a list of the columns that should get new codes, like ['F5_01', 'F5_02'], or a dictionary that maps each source column (kept unchanged) to the new column that receives the recoded values, like {'F5_01': 'F5_01_rec', 'F5_02': 'F5_02_rec'}.
values
:dict[int, Any]
- A dictionary with the new codes as keys and the old codes to be recoded as values, like {1: [1, 2], 2: range(3, 6), 3: 6}. This means that the old codes 1 and 2 become the new code 1, the codes 3, 4 and 5 become the new code 2, and the code 6 becomes the new code 3.
keep_untouched_codes
:bool, optional
- Controls what happens to codes that are not covered by the "values" argument. For example, when a code 7 exists but is not recoded: if True, it is kept as it was. If False, all answers with code 7 are set to null and the label for code 7 is deleted. Defaults to True.
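The following is a minimal pure-Python sketch of how the "values" mapping and keep_untouched_codes can be read. It is not MetaTable's implementation: the helper names expand_recode_values and recode_answers are hypothetical, and answers are modelled as a plain list rather than a DataFrame column.

```python
from typing import Any, Optional


def expand_recode_values(values: dict[int, Any]) -> dict[int, int]:
    """Turn {new_code: old_code(s)} into a flat {old_code: new_code} lookup."""
    lookup: dict[int, int] = {}
    for new_code, old_codes in values.items():
        # A single old code may be given as a bare int; wrap it in a list.
        if isinstance(old_codes, int):
            old_codes = [old_codes]
        for old_code in old_codes:
            lookup[old_code] = new_code
    return lookup


def recode_answers(
    answers: list[Optional[int]],
    values: dict[int, Any],
    keep_untouched_codes: bool = True,
) -> list[Optional[int]]:
    """Apply the recoding to a list of answers (hypothetical helper)."""
    lookup = expand_recode_values(values)
    recoded: list[Optional[int]] = []
    for code in answers:
        if code in lookup:
            recoded.append(lookup[code])
        elif keep_untouched_codes:
            # Codes not mentioned in `values` stay as they were.
            recoded.append(code)
        else:
            # Codes not mentioned in `values` become missing.
            recoded.append(None)
    return recoded


values = {1: [1, 2], 2: range(3, 6), 3: 6}
answers = [1, 2, 3, 4, 5, 6, 7]
print(recode_answers(answers, values, keep_untouched_codes=True))
print(recode_answers(answers, values, keep_untouched_codes=False))
```

With keep_untouched_codes=True the answer 7 passes through unchanged. With False it becomes None, which stands in here for the null value described above.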
def rename_columns(self, columns: dict[str, str])
-
Renames the given columns.
Args
columns
:dict[str, str]
- Columns to rename: {'old_name': 'new_name'}
def return_components(self) ‑> tuple[pandas.core.frame.DataFrame, pyreadstat._readstat_parser.metadata_container, dict]
-
Returns the updated DataFrame, the metadata contained in the object, and the gfs metadata.
Returns
tuple[pd.DataFrame, pyreadstat._readstat_parser.metadata_container, dict]
- A tuple of the DataFrame, the pyreadstat metadata and the gfs metadata, in that order
def scale_level(self, columns: list[str], new_scale: str)
-
Changes the scale levels of the given columns.
Args
columns
:list[str]
- A list of columns
new_scale
:str
- The new scale of the given columns. Can be 'nominal', 'scale' or 'ordinal'
def select_columns(self, columns: list)
-
Selects the given columns and removes all others.
Args
columns
:list
- Columns to select
def show(self, columns: Union[list[str], str] = 'last changed', only_label: bool = False, show_objects: bool = False, total: bool = False)
-
Shows info about the value labels and variable label of the given columns.
Args
columns
:Union[list[str], str]
- Columns to be shown, either a list of column names or a single column name as a string. If no column is passed in, the last changed columns are used.
only_label
:bool
- Only shows the variable label
show_objects
:bool
- Also shows the python objects (makes copying easier)
total
:bool
- Adjusts the output for use by the show_all function. Not useful to the end user.
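As an illustration of the "columns" argument, the sketch below shows one way a list, a single name, or the default 'last changed' could be normalised to a list of column names. The helper resolve_columns and its last_changed parameter are hypothetical and are not part of MetaTable's documented API.

```python
from typing import Union


def resolve_columns(
    columns: Union[list[str], str],
    last_changed: list[str],
) -> list[str]:
    """Normalise the `columns` argument to a list of column names."""
    if columns == "last changed":
        # Default: fall back to the columns touched by the previous call.
        return list(last_changed)
    if isinstance(columns, str):
        # A single column name is wrapped in a list.
        return [columns]
    return list(columns)


print(resolve_columns("last changed", last_changed=["F5_01_rec"]))
print(resolve_columns("F5_01", last_changed=["F5_01_rec"]))
print(resolve_columns(["F5_01", "F5_02"], last_changed=["F5_01_rec"]))
```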
def show_all(self, columns: Union[list[str], str] = 'last changed')
-
Shows info about the value labels and the metadata of the given columns.
Args
columns
:Union[list[str], str]
- Columns to be shown, either a list of column names or a single column name as a string. If no column is passed in, the last changed columns are used.
def show_meta(self, columns: Union[list[str], str] = 'last changed', total: bool = False)
-
Shows info about the metadata of the given columns.
Args
columns
:Union[list[str], str]
- Columns to be shown, either a list of column names or a single column name as a string. If no column is passed in, the last changed columns are used.
total
:bool
- Adjusts the output for use by the show_all function. Not useful to the end user.
def val_lab(self, columns: list[str], labels: Union[dict[int, str], str], keep_untouched_codes=False)
-
Changes the value labels of the given columns
Args
columns
:list[str]
- A list of columns that need new value labels
labels
:Union[dict[int, str], str]
- Either a dictionary with the new labels, like {1: "label for code 1", 2: "label for code 2"}, or the name of the variable whose value labels should be used, like "variable_name"
keep_untouched_codes
:bool, optional
- If True, the old labels of the column are kept and the new ones are added to them instead of replacing all labels. Defaults to False.
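The effect of keep_untouched_codes on value labels can be sketched with plain dictionaries. This is an illustration only: merge_value_labels is a hypothetical helper, and the assumption that a new label overrides an old label for the same code is not confirmed by the description above.

```python
def merge_value_labels(
    old_labels: dict[int, str],
    new_labels: dict[int, str],
    keep_untouched_codes: bool = False,
) -> dict[int, str]:
    """Combine old and new value labels (hypothetical helper)."""
    if keep_untouched_codes:
        # Keep the old labels and add the new ones on top.
        # Assumption: for a code present in both, the new label wins.
        return {**old_labels, **new_labels}
    # Replace all labels with the new ones.
    return dict(new_labels)


old = {1: "yes", 2: "no", 9: "don't know"}
new = {1: "agree", 2: "disagree"}
print(merge_value_labels(old, new, keep_untouched_codes=True))
print(merge_value_labels(old, new, keep_untouched_codes=False))
```

With keep_untouched_codes=True the label for code 9 survives. With the default False only the labels for codes 1 and 2 remain.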
def var_lab(self, columns: list[str], text: str)
-
Changes the variable labels of the given columns
Args
columns
:list[str]
- A list of columns that need a new variable label
text
:str
- Text of the variable label
def write_sav(self, path: str)
-
Writes the MetaTable object, including its DataFrame and metadata, to a sav file.
Args
path
:str
- Path of the new file, including the file name. To save the file in the package directory, use "./filename.sav"