Data Processor
CSV parsing, format detection, data transformation, and column normalization.
Classes
Functions
detect_data_format(df)
Detect the format of uploaded data.
Returns one of:
- "eval_runner": New evaluation runner output format with run_id, dataset_id, passed
- "tree_format": Hierarchical metrics with parent relationships
- "flat_format": Simple metric scores in long format
- "simple_judgment": Binary pass/fail judgments
- "fresh_annotation": Raw outputs for annotation
- "unknown": Could not detect format
Source code in backend/app/services/data_processor.py
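As a rough illustration of how column-based detection of this kind might work, the sketch below classifies a collection of column names. Only the `eval_runner` columns (`run_id`, `dataset_id`, `passed`) come from the description above. The function name `guess_format` and every other column name (`parent`, `metric_name`, `metric_score`, `judgment`, `output`) are hypothetical, and the real `detect_data_format` takes a DataFrame and may apply entirely different rules.

```python
from typing import Iterable

# Only the eval_runner columns are named in the documentation above.
EVAL_RUNNER_COLUMNS = {"run_id", "dataset_id", "passed"}


def guess_format(columns: Iterable[str]) -> str:
    """Classify a dataset by its column names (illustrative rules only)."""
    cols = {c.strip().lower() for c in columns}
    if EVAL_RUNNER_COLUMNS <= cols:
        return "eval_runner"
    # Everything below relies on hypothetical column names.
    if {"metric_name", "parent"} <= cols:
        return "tree_format"
    if {"metric_name", "metric_score"} <= cols:
        return "flat_format"
    if "judgment" in cols:
        return "simple_judgment"
    if "output" in cols:
        return "fresh_annotation"
    return "unknown"
```

With a DataFrame in hand, a caller could pass `df.columns` to such a helper. Checking the most specific format first keeps a richer dataset from being misread as a simpler one.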
detect_tree_format(df)
Check if the uploaded data is in tree format.
add_default_product(df)
Add default metadata values.
add_columns_to_flat_format(df)
Add empty columns to flat format dataset for tree visualization compatibility.
back_compatible_naming(df)
Apply backwards-compatible column renames.
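A backwards-compatible rename of this kind is often a single `DataFrame.rename` call driven by a mapping. The sketch below assumes pandas is installed. The helper name `rename_legacy_columns` and the column names in the mapping are hypothetical, since the actual renames are not listed in this documentation.

```python
import pandas as pd

# Hypothetical legacy -> current column names; the real mapping is not
# given in this documentation.
LEGACY_COLUMN_NAMES = {"record_id": "id", "score": "metric_score"}


def rename_legacy_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with any legacy column names replaced."""
    # rename() ignores mapping keys that are not present in the frame.
    return df.rename(columns=LEGACY_COLUMN_NAMES)


old = pd.DataFrame({"record_id": [1, 2], "score": [0.5, 0.9]})
new = rename_legacy_columns(old)
```

Because `rename` returns a new frame by default, the caller's original DataFrame is left untouched.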
safe_literal_eval(val)
setup_fresh_annotation(df_raw)
Set up fresh annotation format.
process_uploaded_data(df_raw)
Process uploaded data and return the processed DataFrame, its format type, and a message.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df_raw` | `DataFrame` | Raw DataFrame from uploaded file | required |
Returns:
| Type | Description |
|---|---|
| `tuple[DataFrame \| None, str \| None, str]` | Tuple of (processed_df, format_type, message) |
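The return type allows `None` in the first two slots, so a caller presumably needs to check for that before using the result. The sketch below shows one way to unpack the tuple. The helper `summarize_upload` is hypothetical, and reading `None` as "processing failed" is an assumption drawn from the type signature rather than from the source code.

```python
from typing import Any, Optional, Tuple

# Stand-in for the (processed_df, format_type, message) tuple; Any is used
# so the sketch runs without pandas installed.
UploadResult = Tuple[Optional[Any], Optional[str], str]


def summarize_upload(result: UploadResult) -> str:
    """Turn the result tuple into a one-line status string for a caller."""
    processed_df, format_type, message = result
    if processed_df is None or format_type is None:
        # Assumption: None in either slot signals that processing failed.
        return f"upload rejected: {message}"
    return f"upload accepted as {format_type}: {message}"
```

In real use the argument would be the value returned by `process_uploaded_data(df_raw)`.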
convert_to_csv(annotations)
Convert annotations to CSV format for download.
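For illustration, a conversion like this can be written with the standard library alone. The sketch below assumes `annotations` is a list of dicts whose keys become the CSV columns. The documentation does not state that shape, and the name `annotations_to_csv` is hypothetical.

```python
import csv
import io
from typing import Any


def annotations_to_csv(annotations: list[dict[str, Any]]) -> str:
    """Serialize a list of annotation dicts to CSV text."""
    if not annotations:
        return ""
    # Collect the union of keys, preserving first-seen order, for the header.
    fieldnames: list[str] = []
    for row in annotations:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(annotations)  # missing keys are written as empty cells
    return buffer.getvalue()
```

Building the header from the union of keys means rows with differing fields still serialize without error.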
get_metrics_for_format(df, data_format)
Get metrics based on data format.
identify_metric_component_mapping(df)
Identify metrics and components in tree format data.
drop_latency(df)
Drop the latency column if passed through config.
convert_tree_to_wide_format(df, metric_type=None, include_conversation=False)
Convert tree format to wide format for analytics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | Input dataframe in tree format | required |
| `metric_type` | `str \| None` | Optional metric type filter | `None` |
| `include_conversation` | `bool` | Whether to include conversation column | `False` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Wide-format dataframe |
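A long-to-wide conversion is typically a pivot. The sketch below shows the general idea with pandas, not the actual logic of `convert_tree_to_wide_format`. The column names `id`, `metric_name`, and `metric_score` are hypothetical, and the parent relationships of the tree format are left out for brevity.

```python
import pandas as pd

# Hypothetical long-format rows: one row per (record, metric) pair.
long_df = pd.DataFrame(
    {
        "id": ["a", "a", "b", "b"],
        "metric_name": ["accuracy", "fluency", "accuracy", "fluency"],
        "metric_score": [0.9, 0.7, 0.4, 0.8],
    }
)

# Pivot so each metric becomes its own column, one row per record.
wide_df = (
    long_df.pivot(index="id", columns="metric_name", values="metric_score")
    .rename_axis(columns=None)  # drop the leftover "metric_name" axis label
    .reset_index()
)
```

Note that `pivot` raises an error when an (id, metric) pair appears more than once. `pivot_table` with an aggregation function is the usual alternative in that case.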
prepare_data_for_analytics(data, data_format, metric_type=None, include_conversation=False)
Prepare data for analytics display.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `list[dict[str, Any]]` | List or dict-like structure containing analytics data | required |
| `data_format` | `str` | 'tree_format' or 'simple_judgment' | required |
| `metric_type` | `str \| None` | Filter for specific metric type | `None` |
| `include_conversation` | `bool` | Whether to include conversation column in output | `False` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Tuple` | `tuple[DataFrame, list[str], dict[str, list[str]]]` | (processed DataFrame, metric_columns list, mapping dict) |
parse_json(data)
Parse a string that may be malformed, double-encoded, or a Python literal.
Handles Python-specific values such as nan, inf, and booleans.
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | A dictionary object. Returns an empty dict ({}) if all parsing attempts fail. |
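One way to get this kind of tolerant behaviour is to try strict JSON first, fall back to `ast.literal_eval`, and repeat to unwrap double encoding. The sketch below follows that approach under stated assumptions and is not the actual `parse_json` implementation. The names `parse_loose_json`, `_parse_once`, `_restore_nan`, and the `__nan__` placeholder are all hypothetical.

```python
import ast
import json
import math
import re
from typing import Any

_NAN_TOKEN = "__nan__"  # placeholder, assumed not to occur in real data


def _restore_nan(value: Any) -> Any:
    """Swap the placeholder back to float nan, recursing into containers."""
    if isinstance(value, str):
        return math.nan if value == _NAN_TOKEN else value
    if isinstance(value, dict):
        return {k: _restore_nan(v) for k, v in value.items()}
    if isinstance(value, list):
        return [_restore_nan(v) for v in value]
    return value


def _parse_once(text: str) -> Any:
    """One decoding attempt: strict JSON first, then a Python literal."""
    try:
        return json.loads(text)
    except ValueError:
        pass
    # Bare nan/inf are names, not literals, so ast.literal_eval rejects them.
    # Caveat: this regex also rewrites those words inside quoted strings.
    patched = re.sub(r"\bnan\b", repr(_NAN_TOKEN), text)
    patched = re.sub(r"\binf\b", "1e999", patched)  # 1e999 overflows to inf
    try:
        return _restore_nan(ast.literal_eval(patched))
    except (ValueError, SyntaxError, TypeError):
        return None


def parse_loose_json(data: Any) -> dict[str, Any]:
    """Return a dict, or {} if every parsing attempt fails."""
    for _ in range(3):  # extra passes unwrap double-encoded strings
        if isinstance(data, dict):
            return data
        if not isinstance(data, str):
            return {}
        data = _parse_once(data.strip())
    return data if isinstance(data, dict) else {}
```

For example, `parse_loose_json("{'a': nan, 'c': True}")` yields a dict whose `a` value is a float NaN, while unparseable input yields `{}`.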
process_database_data(df)
Process data coming from database format.