Skip to main content

Multiple Interlinked CSVs

When importing multiple CSV files that reference each other through various IDs, you may want more granular control.

Instead of looking at and modifying the YAML file in a text editor:

You can use our graphical ImGui interface!

If you're using MacOS, make sure you use Python 3.9 (as of March 2022, Python 3.10 does not play nicely with ImGui).

In this example, we'll use the Northwind dataset. We've made the zipped CSV files available TODO: link northwind.zip

Also check out our blog post on SQL imports.

Install ImGui and SDL2

pip3 install "imgui[sdl2]"  

# On MacOS:
brew install SDL2

# On other systems, install with your package manager, apt, pacman, yum, etc...

Generate YAML file

Unzip the CSV files into a new empty directory and then start your interpreter
from this directory. Then:

from zef import *  
from zef.ops import *
from zef.experimental import sql_import

decl = sql_import.guess_csvs("*.csv")
decl | save_file["sql_import.yaml"] | run

Open YAML file with ImGui interface

python3 -m zef.experimental.sql_ui.wizard sql_import.yaml  

If using MacOS, make sure you use Python 3.9 (as of March 2022) and are using the Python 3.9 Zef wheel.

python3.9 -m zef.experimental.sql_ui.wizard sql_import.yaml  

Change Entity names from plural to singular

Change cross-referenced IDs from Field to Entity and rename their RTs

See your options through the dropdown (make sure the Entity names are edited first to see the correct options).

Change AET types

Change Entities to Relations


Set the correct Source and Target.

Save your edited YAML file

The edited YAML results will save to the current filename.

If needed, we may add an option to save to a new filename in the future.

Load edited YAML file into Zef

If you have appropriately configured the importer you should be able to run the
following code to perform the actual import:

decl = "sql_import.yaml" | load_file | run | get["content"] | collect  
g = Graph()
actions = sql_import.import_actions(decl)
actions | transact[g] | run

If you would like, you can use our pre-configured file for this dataset: TODO: northwind_example.yaml

See the results of your import!

yo(g)  

<...snip...>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Atomic Entities ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[5993 total, 5993 alive] AET.String
[5217 total, 5217 alive] AET.Float
[3496 total, 3496 alive] AET.Int
[2487 total, 2487 alive] AET.Time

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Entities ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[11 total, 11 alive] ET.Employee
[93 total, 93 alive] ET.Customer
[9 total, 9 alive] ET.EmployeeTerritory
[53 total, 53 alive] ET.Territory
[4 total, 4 alive] ET.Region
[830 total, 830 alive] ET.Order
[3 total, 3 alive] ET.Shipper
[77 total, 77 alive] ET.Product
[29 total, 29 alive] ET.Supplier
[8 total, 8 alive] ET.Category

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Relations ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[4 total, 4 alive] RT.RegionDescription
[4 total, 4 alive] (ET.Region, RT.RegionDescription, AET.String)
[2155 total, 2155 alive] RT.OrderDetails
[2155 total, 2155 alive] (ET.Order, RT.OrderDetails, ET.Product)
[8 total, 8 alive] RT.ReportsTo
[8 total, 8 alive] (ET.Employee, RT.ReportsTo, ET.Employee)
<...snip...>