# ETL and high performance computing

Midio has an **experimental** native package for using the [**polars data frame library**](https://pola.rs/) for handling large amounts of data.

## How to use it

Start by adding the **polars** package using the [package manager](https://docs.midio.com/midio-docs/package-manager).

Then use one of the functions in the `Polars.Source` module to import your data into the polars format. Supported sources include CSV, JSON, newline-delimited JSON, and queries against a Postgres database.

These functions return a data frame object, which can be operated on using either `Polars.Execute Sql` or `Polars.Execute Dynamic Sql`.

### Execute Sql

`Execute Sql` lets you provide a list of inputs and then execute an SQL query over those inputs. By default, it accepts a single input.

<figure><img src="https://1896308808-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRdFpuRAnTVYgmlCXLLou%2Fuploads%2FWMXXZySwDM6EodswSWMx%2Fimage.png?alt=media&#x26;token=49c3f0e2-f9cf-4c52-8ce5-448baa5483e4" alt=""><figcaption></figcaption></figure>

### Execute Dynamic Sql

`Execute Dynamic Sql` works in a similar way, but expects an object where each key is the name of a source and each value is a data frame.

<figure><img src="https://1896308808-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRdFpuRAnTVYgmlCXLLou%2Fuploads%2F11sLWEHoM0aKwLmA0bw2%2Fimage.png?alt=media&#x26;token=c0ebfd2f-ab85-4f88-9b79-f22f4b8fa1e8" alt=""><figcaption></figcaption></figure>
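
To make the shape of that input object concrete, here is a rough sketch in plain Python. It is not Midio's or Polars' API: the `execute_dynamic_sql` helper is hypothetical, Python's built-in `sqlite3` stands in for the Polars SQL engine, and each data frame is modelled as a simple dictionary of columns. It also assumes that each key in the object is the name the query uses to refer to that source.

```python
import sqlite3


def execute_dynamic_sql(sources, query):
    """Hypothetical stand-in: run `query` over one in-memory table per key.

    `sources` maps a source name to a column-oriented table
    (a dict of column name -> list of values).
    """
    conn = sqlite3.connect(":memory:")
    try:
        for name, columns in sources.items():
            col_names = list(columns)
            col_defs = ", ".join(f'"{c}"' for c in col_names)
            # The key of the object becomes the table name used in the query.
            conn.execute(f'CREATE TABLE "{name}" ({col_defs})')
            placeholders = ", ".join("?" for _ in col_names)
            rows = zip(*(columns[c] for c in col_names))
            conn.executemany(
                f'INSERT INTO "{name}" VALUES ({placeholders})', rows
            )
        cursor = conn.execute(query)
        result_columns = [d[0] for d in cursor.description]
        fetched = cursor.fetchall()
    finally:
        conn.close()
    # Return the result column by column.
    return {c: [row[i] for row in fetched] for i, c in enumerate(result_columns)}


# Example data, invented for illustration only.
sources = {
    "users": {"id": [1, 2], "name": ["Ada", "Linus"]},
    "orders": {"user_id": [1, 1, 2], "total": [10, 5, 7]},
}
result = execute_dynamic_sql(
    sources,
    "SELECT u.name AS name, SUM(o.total) AS total "
    "FROM users u JOIN orders o ON o.user_id = u.id "
    "GROUP BY u.name ORDER BY u.name",
)
```

The point of the sketch is only that a query can refer to `users` and `orders` because those are the keys of the input object; the actual node and its SQL dialect may behave differently.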

## Getting the results

After executing one or more queries, the results can be collected using the `Polars.Collect` function.

<figure><img src="https://1896308808-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRdFpuRAnTVYgmlCXLLou%2Fuploads%2FFF2UkacWuyu38CjGJsAx%2Fimage.png?alt=media&#x26;token=7168c62c-7aba-419e-af92-301aee63dde4" alt=""><figcaption></figcaption></figure>

### About the output

The output is by default in a column-oriented format, meaning you get an object where each field represents a column, and its value is a list containing the values for each row.

<figure><img src="https://1896308808-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRdFpuRAnTVYgmlCXLLou%2Fuploads%2FK41q4JaHjrcMAyRvSPN0%2Fimage.png?alt=media&#x26;token=2dbf4f4e-da01-49c6-95a7-0cc89f2e0dd8" alt=""><figcaption></figcaption></figure>

To get this data converted back to a list of objects, which is often a more useful format, you can use the `Transpose` function.

<figure><img src="https://1896308808-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FRdFpuRAnTVYgmlCXLLou%2Fuploads%2Flk2efgyj2kfcbssSfAJn%2Fimage.png?alt=media&#x26;token=4d84dc41-52f4-4e2d-887c-39b2dbddf4b2" alt=""><figcaption></figcaption></figure>
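
The difference between the two shapes can be sketched in plain Python. This is only an illustration of the idea, not the implementation of Midio's `Transpose` function; the `transpose` helper and the sample data are hypothetical.

```python
def transpose(columns):
    """Turn a column-oriented object into a list of row objects.

    `columns` maps each column name to a list of values, one per row.
    Assumes every column has the same number of values.
    """
    names = list(columns)
    return [
        dict(zip(names, values))
        for values in zip(*(columns[name] for name in names))
    ]


# Column-oriented: one field per column, each holding a list of values.
collected = {"name": ["Ada", "Linus"], "age": [36, 54]}

# Row-oriented: one object per row.
rows = transpose(collected)
```

With this sample input, `rows` holds one object per row, each with a `name` and an `age` field.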
