This document outlines the technical details of the Database Access Suite as introduced in Third Party Data Access and Analytics Integration

What is the Database Access Suite?

The core mechanism behind this suite is providing access via a TLS/SSL-secured connection to a set of secure MySQL/MariaDB views on a read-replica of the Link2Feed Intake database systems. The connection, data formatting, and access is all managed through existing MySQL standards that can connect seamlessly into a large number of data tools.

Data is available either anonymized, or fully identified. Access security is managed through MySQL standard tools and an IP address allow-list.

The system is useful as either a fully queryable data access tool, or as a base point for a synchronization system.

How Do I Access It?

Reach out to your Link2Feed Account manager, who will work to amend your current contract and coordinate with the Link2Feed support team to do the initial setup.

The Link2Feed support team will provide connection details including an access endpoint URL, a username and a password.

Documentation

Entity Relationship Diagram (ERD)

These documents are a series of Entity Relationship Diagrams based on the MySQL VIEWs that the Database Access Suite provides. This is available in both PNG format as a large image, and also in a MySQL Workbench format (a free software).

MySQL Workbench can be downloaded at: https://dev.mysql.com/downloads/workbench/

These sample files have the prefix hfb_ but in a test or production environment these prefixes would be different. All other parts of the table names would remain the same.

MySQL DDL Queries for Creating Replica Datastore

This document is a series of MySQL DDL queries for creating tables that mirror the format and relationship of the database views that the access suite can reach. It includes a prefix on each table name. This mirrors our deployment pattern, where each client is given a short code to represent their food bank (eg- St. Mark’s Food Bank might be shortened as smfb). In the sample file, this prefix is l2f_test_, and a new code would be provided for your food bank in its place.

Database Fields Summary

Note:  Table names are marked with a sample prefix. This prefix varies per-client. For example: Acme Food Bank may be given tables named afb_program_visit_activities, where afb is the prefix. In these examples, for consistency, the prefix is l2f_test_.

Core Visit Tables (Every Recorded Visit Creates These)

l2f_test_program_visit_activities

 

l2f_test_program_visit_household_addresses

 

l2f_test_program_visit_household_members

 

l2f_test_program_visit_household_dietary_considerations

 

l2f_test_program_visit_household_expenses

 

l2f_test_program_visit_household_income

 

l2f_test_program_visit_household_languages

 

l2f_test_program_visit_household_member_contact_info

 

l2f_test_program_visit_household_member_ethnicity_types

 

l2f_test_program_visit_household_member_self_identity_types

 

l2f_test_program_visit_household_member_social_program_types

Core Metadata (Every Recorded Visit References These)

l2f_test_countries

 

l2f_test_data_types

 

l2f_test_languages

 

l2f_test_modules

 

l2f_test_organizations

 

l2f_test_programs

 

l2f_test_provisions

 

l2f_test_workers

Non-Core Tables (Not Every Visit Creates These)

l2f_test_client_signature_records

l2f_test_household_notes

 

l2f_test_program_bulk_log

 

l2f_test_program_visit_activity_properties

 

l2f_test_program_visit_activity_provisions

 

l2f_test_program_visit_activity_referrals

 

l2f_test_program_visit_activity_signature_records

 

l2f_test_program_visit_household_notes

 

l2f_test_program_visit_signature_records

 

l2f_test_program_visit_visit_reasons

Very Specific Modules (Very Few Visits Create These)

l2f_test_program_visit_activity_delivery_details

 

l2f_test_program_visit_activity_external_referrals

 

l2f_test_program_visit_ffc_info

 

l2f_test_program_visit_household_member_school_attendance

 

l2f_test_program_visit_voucher_transactions

 

l2f_test_program_visit_vouchers

 

l2f_test_schools

Incremental Syncs and the Hard Delete Log

All entries in the Database Access Suite include a record for updated_at which can be used to do incremental pulls. For example, SELECT * FROM l2f_test_program_visit_activities WHERE updated_at > '2022-01-01' will pull only those records updated since Jan 1st, 2022.

However, the Link2Feed system was originally built with hard-deletes (eg- a visit recorded in error is removed entirely from the database when it is deleted by a user.) This leads to gaps and sync issues when pulling incremental data from the system, as there is no deleted_at or updated_at for these records, they are simply gone.

To provide insight into this process, Link2Feed has given each Database Access Suite user access to the hard_delete_log table, note that this table does not have a prefix. That table has the following columns:

This allows for two processes to be setup:

  1. An incremental pull of every table, once per day, for example:

    SELECT * FROM l2f_test_program_visit_activities WHERE updated_at > DATE_SUB(NOW(), INTERVAL 1 DAY);
  2. An incremental check of records on that table that have been deleted:

    SELECT CONCAT('l2f_test', table_name) AS table_name, id, deleted_at FROM hard_delete_log WHERE table_name = 'program_visit_activities' AND deleted_at > DATE_SUB(NOW(), INTERVAL 1 DAY);

Sample Queries Against Database Access Suite

The “trunk” of the tree is the l2f_test_program_visit_activities table. Each record in this table represents a single row in the Link2Feed exports that your team has been receiving from the group.

 

Attached to each of those are a number of l2f_test_program_visit_household_members records, referring back by program_visit_activity_id, and these represent each of the clients. One of these records will be for the main client, which can be identified as their `client_id` matches the `client_id` of the l2f_test_program_visit_activities record.

 

There are also several tables like l2f_test_program_visit_household_member_ethnicity_types (and _self_identity_types, etc) which are demographic fields tied to individuals. They link to both an individual on a visit by a program_visit_household_member_id and also to an ethnicity_type_id that links over to the l2f_test_data_types table.

These relationships can be gleaned from the SQL and ERD documents, but I admit it’s hard to separate out the important bits.

So let’s look at a real example: finding how many duplicated clients with Hispanic/Latino heritage visited in 2020.

That would be:

SELECT COUNT(*) AS number_of_latino_visitors
FROM l2f_test_program_visit_household_member_ethnicity_types AS pvhmet
LEFT JOIN l2f_test_data_types AS dt
  ON pvhmet.ethnicity_type_id = dt.id
LEFT JOIN l2f_test_program_visit_household_members AS pvhm
  ON pvhmet.program_visit_household_member_id = pvhm.id
LEFT JOIN l2f_test_program_visit_activities AS pva
  ON pvhm.program_visit_activity_id = pva.id
WHERE dt.name = 'hispanic_latino'
  AND pva.visit_date BETWEEN '2020-01-01' AND '2020-12-31'; 

And then let’s say we want the unique count, we would pull a COUNT(DISTINCT pvhm.client_id) instead. The client_id field is the consistent unique identifier for clients. That would be:

SELECT COUNT(DISTINCT pvhm.client_id) AS number_of_unique_latino_visitors
FROM l2f_test_program_visit_household_member_ethnicity_types AS pvhmet
LEFT JOIN l2f_test_data_types AS dt
  ON pvhmet.ethnicity_type_id = dt.id
LEFT JOIN l2f_test_program_visit_household_members AS pvhm
  ON pvhmet.program_visit_household_member_id = pvhm.id
LEFT JOIN l2f_test_program_visit_activities AS pva
  ON pvhm.program_visit_activity_id = pva.id
WHERE dt.name = 'hispanic_latino'
  AND pva.visit_date BETWEEN '2020-01-01' AND '2020-12-31';

Digging deeper, let’s say we wanted to group those visits into how many clients are in the household. This is where we get a little grey-area and domain-specific. There are two important numbers that Link2Feed tracks: 1) the number of people who were actually recorded as part of the visit, and checked off as attending (which some of our users utilize very effectively, and some do not) and 2) the number of people who live in the household, whether or not they are part of the visit. #1 is grouping by how many l2f_test_program_visit_household_members records relate to the visit. #2 is grouping by the l2f_test_program_visit_activities.total_members field.

So that is either:

#1 – based on how many visitors were on the visit:

SELECT
  COUNT(*) AS number_of_latino_visitors,
  (
    SELECT COUNT(*)
    FROM l2f_test_program_visit_household_members
    WHERE program_visit_activity_id = pva.id
  ) AS visit_members
FROM l2f_test_program_visit_household_member_ethnicity_types AS pvhmet
LEFT JOIN l2f_test_data_types AS dt
  ON pvhmet.ethnicity_type_id = dt.id
LEFT JOIN l2f_test_program_visit_household_members AS pvhm
  ON pvhmet.program_visit_household_member_id = pvhm.id
LEFT JOIN l2f_test_program_visit_activities AS pva
  ON pvhm.program_visit_activity_id = pva.id
WHERE dt.name = 'hispanic_latino'
  AND pva.visit_date BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY visit_members;

Or #2 – based on how many people live in the visitors’ household:

SELECT
  COUNT(*) AS number_of_latino_visitors, pva.total_members
FROM l2f_test_program_visit_household_member_ethnicity_types AS pvhmet
LEFT JOIN l2f_test_data_types AS dt
  ON pvhmet.ethnicity_type_id = dt.id
LEFT JOIN l2f_test_program_visit_household_members AS pvhm
  ON pvhmet.program_visit_household_member_id = pvhm.id
LEFT JOIN l2f_test_program_visit_activities AS pva
  ON pvhm.program_visit_activity_id = pva.id
WHERE dt.name = 'hispanic_latino'
  AND pva.visit_date BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY pva.total_members;

The second statistic is much more useful for CSFP visits, as CSFP visits are recorded with only one individual on them, so the only way to get those household details on CSFP visits is to use the total_members field.

On that note: One of the things both Reports and Exports within Link2Feed’s Intake system do is allow restricting by program type. To do that, we need to link in the l2f_test_programs table and restrict on its type field.

SELECT
  COUNT(*) AS number_of_latino_visitors, pva.total_members
FROM l2f_test_program_visit_household_member_ethnicity_types AS pvhmet
LEFT JOIN l2f_test_data_types AS dt
  ON pvhmet.ethnicity_type_id = dt.id
LEFT JOIN l2f_test_program_visit_household_members AS pvhm
  ON pvhmet.program_visit_household_member_id = pvhm.id
LEFT JOIN l2f_test_program_visit_activities AS pva
  ON pvhm.program_visit_activity_id = pva.id
LEFT JOIN l2f_test_programs AS p
  ON pva.program_id = p.id
WHERE dt.name = 'hispanic_latino'
  AND pva.visit_date BETWEEN '2020-01-01' AND '2020-12-31'
  AND p.type = 'csfp_food_bank'
GROUP BY pva.total_members;

Video Demonstrations

Connecting with MySQL Workbench and MySQL Command Line Interface (CLI)

Connecting with DBeaver

Connecting and Using Tableau Desktop

Connecting with Azure Data Factory