Handle Spreadsheet Import, Mapping, and Validation for Your Web App

When it comes to data, spreadsheets are incredibly useful and versatile. If your web app deals with any type of data  —  from sales pipelines to profit and loss statements, you’ve likely dealt with importing CSV files.

One of the first issues you run into with CSV uploads is the formatting of the data. For example:

  • What if the columns are named differently than what you want?
    ie., A column called Name instead of FullName
  • What if the data is formatted differently?
    ie., A date formatted MM-DD-YYYY instead of YYYY-DD-MM
  • What if some of the data is invalid?
    ie., An invalid phone number 123-456-789

Spreadsheet Cover

Because of these possibilities, you will realize you need some type of column mapping and validation abilities, so you only import valid data that your web app can understand.

For this example, I will be using the new gluestick library, which comes with two parts:

  • gluestick-elements: a set of React components that make it easy to build an intuitive import + validation experience for users.
  • gluestick-api: a Dockerized Python API that handles parsing, validating, and mapping the imported CSV data. It also allows you to send the data directly to a cloud service like AWS S3.

If you’d like, feel free to read the gluestick docs or join their Slack for more information. To give you some reference, the final result is shown below, and an interactive demo is available on CodeSandbox.

Final Mapping Flow

Final Mapping Flow

Without further ado, let’s jump in!

The Backend

Before we begin setting up the front end, let’s get the gluestick-api running on our local machine. Before doing this make sure you have Python and Docker installed and running.

For this example, we’ll use the gluestick CLI to get started, but if you’d like to do it manually you can follow the docs.

Install the CLI

Let’s start by installing the CLI, which is available on PyPi.

$ pip install gluestick

Install the Docker Image

Now we can pull the latest version of the gluestick-api and create the default config. I recommend doing this in a unique directory.

$ mkdir mygluestick-project
$ cd mygluestick-project
$ gluestick install
Created default gluestick-api configuration.
Pulling the gluestick-api Docker image...
Using default tag: latest
latest: Pulling from hotglue/gluestick-api
Digest: sha256:6d1a0fdbd884e252a5e6f7abf8f227366b7a1be4fd2ddae4cbd37fe4f217bbcf
Status: Image is up to date for hotglue/gluestick-api:latest
docker.io/hotglue/gluestick-api:latest
Latest gluestick-api Docker image pulled.

From here, you can now configure a target for your data such as AWS S3, but we’ll skip that part for now.

Start the API

Now let’s run the API. By default, it starts on port 5000 but you can change the port using the --port=$PORT option:

$ gluestick run
Starting gluestick-api...
[2021-04-07 20:30:22 +0000] [1] [INFO] Starting gunicorn 20.1.0
[2021-04-07 20:30:22 +0000] [1] [INFO] Listening at: http://0.0.0.0:5000 (1)
[2021-04-07 20:30:22 +0000] [1] [INFO] Using worker: sync
[2021-04-07 20:30:22 +0000] [9] [INFO] Booting worker with pid: 9

That’s it! Now we can move forward to configuring the frontend.

The Frontend

Now we can configure the gluestick-elements library in our React app.

If you’d like to follow along, the code for this example is available on CodeSandbox

Install the Package

Let’s install the package via npm

npm install --save gluestick-elements

Add the Element

Now we can add the React element to our project! Below is a simple example with the GlueStick component

Test the Element

Now that everything is running we can test out the whole flow! If you need some testing data, you can download a sample Leads.csv (link downloads Leads.csv file).

Final Mapping Flow 2
Final mapping flow

gluestick will do the following:

  1. Parse the input CSV file, and determine the available columns.
  2. Pick the nearest matching column names as a suggested mapping and run any validation.
  3. Show any invalid rows to the user and tell them what percentage of data has valid information.
  4. Preview the final data for the user.
  5. Send the data to its final destination (S3, Google Cloud Storage, etc.).

Conclusion

That’s all there is to it! The next step is customizing the schema of the file you want, and configuring any target you want.

If you’re interested in gluestick, I recommend taking a look at the docs.

I am more than happy to answer any questions below. Thanks for reading!



Source link

Write A Comment