Part 7: Additional Tips for ISBLANK and ISNULL

Purpose

This is the seventh article in a series to improve ESSENCE queries. Free-text coding in ESSENCE, which is accessible to all users, follows distinct patterns. Learning to read these patterns allows users to take queries from many places and repurpose them to suit their unique needs. Syndromic surveillance depends heavily on sharing methods, and practitioners must understand the language.

Free Text Coding blue fractal with white text

Introduction

Please see Part 1: Wildcards for background information about the search criteria for ESSENCE free-text queries, which are built around Boolean logical operators and regular expressions.

The use of free-text queries sets syndromic surveillance apart from other surveillance methods. NSSP-ESSENCE allows free-text querying for all users, and the BioSense Platform is adaptable to different data sources and topics. Free-text queries help analysts and epidemiologists produce customized results. They can quickly code free-text queries and rapidly respond to outbreaks, disasters, and health events by adjusting queries as events unfold. Such capabilities empower users to customize queries to fit their level of data, ensuring more accurate results.

Free-text coding follows distinct patterns. Once users learn to read these patterns, they can repurpose queries used elsewhere to suit their unique needs. This increases the speed and efficiency with which one can create and share free-text queries. To repurpose queries others have created, practitioners must understand the language. This series provides in-depth coverage of the search criteria language for ESSENCE free-text queries.

Know the language‎

The search criteria for ESSENCE free-text queries are built around Boolean logical operators and may contain "^" for wildcards; "," for multiple entries; "ISBLANK" to look for blanks; "ISNULL" to look for nulls; [COMMA] to look for commas; and operators "and," "or," "andnot," and parentheses "()" to define order and grouping.

How NULL is different from BLANK

In Part 1. Wildcards, we covered the wildcard symbol for ESSENCE, which is the "^" symbol called a caret. A caret in a free-text ESSENCE query can represent a search for anything, everything, or nothing—all at the same time. We explain that a Chief Complaint free-text query of only a caret would return all visits, but this isn't true for every field. Some fields can be NULL, which is not returned when the free-text query is only a caret.

NULL is the absence of a data value in the field and is different from a BLANK. A BLANK can signify either that input was provided as BLANK ("", or just empty space) or that the system has converted a lack of input to a BLANK. However, to the system, a BLANK is a value, whereas a NULL (in data details, "none") always signifies the absence of input; hence, is no value. Referencing Part 1 of this series, a query for a "^" looks for any value in a field. Since BLANK is technically a value in SQL, a "^" will return fields with BLANKs as well as number, letter, and symbol values.

Some data fields can be NULL or have contents but never be BLANK. Likewise, some data fields can be BLANK or have contents but never a NULL value. Some examples are listed below.

Sample of common NSSP ESSENCE fields

NULL is the absence of a data value in the field
Can be BLANK Can be NULL
ChiefComplaints
Discharge Diagnosis
Triage Notes Orig
ICD Chapter Free Text
Discharge Diagnosis Parsed Free Text
CC and DD
CC and DD Category Free Text
Chief Complaint Orig
SubSyndrome Free Text

Why are some fields BLANK versus NULL? The reasons vary and are not always easy to identify.

In the example of Chief Complaint, this is the processed version of the first non-NULL value provided. If nothing is provided, the field is left BLANK.

Alternatively, in an example like the CC and DD Category Free Text, our queries look through all emergency department (ED) visits and label them with appropriate CC and DD Categories. If the ED visit doesn't qualify for any of these query definitions, no label is applied, and the field is "without value" or NULL.

The way in which ESSENCE processes data in some fields has changed slightly over the years. For example, relatively old Discharge Diagnosis fields with visit dates prior to 2018 could sometime be NULL.

Determining which fields do and do not have NULL or BLANK values can be tricky. If in doubt, it is always best to run a quick query to test. The NSSP–ESSENCE query for NULL is “ISNULL,” and the query for BLANK is “ISBLANK.”

Using ISNULL and ISBLANK

Although NULL and BLANK are important concepts to understand, querying for these values is neither a common nor general use of ESSENCE. Here are a few examples of queries using these concepts:

  • In a field that has BLANKs, querying the field for "ISBLANK" will show visits with this value
  • In a field that has NULLs, querying the field for "ISNULL" will show visits with a NULL value in the field
  • If querying in a field that can have NULLs, run the following query "ISNULL,or,^" to return all visits
  • In a field that has BLANKs where we want to see all BLANKs as well as values, we can simply query "^"
  • If we want to see all ED visits with contents in a field that can have BLANKS but exclude any BLANKs, then we would run "^,ANDNOT,ISBLANK"
  • If including a field that has NULLs and want only those visits that are not NULL in that field, we can run a query for "^"

If you are applying your query to multiple fields that could contain BLANK or NULL, you will need to use "ISNULL,or,^" to obtain all visits. It is also necessary to use ISNULL when negating visits that contain specific CC and DD categories or subsyndrome labels if you do not also want to simultaneously exclude visits that have a NULL value in those fields. In other words, if you want to keep all visits except those with a specific CCDD category or subsyndrome label in a query, you will need to use ISNULL.

Let's look deeper into an example that makes use of these concepts. The CCDD Category "Fever and Cough-SoB-DiffBr neg Influenza DD v1" captures visits with at least one of multiple subsyndrome categories and then negates records that contain "CDC Influenza DD v1" in the CCDD category free-text field. The syntax of the CCDD category free-text negation is as follows: "ISNULL,or,^,ANDNOT,(,^;CDC Influenza DD v1^,)". A verbal conversion of this negation syntax reads as:

Give me every visit with CCDD category free text that contains "nothing," OR give me all visits with CCDD category free text that contain "something," if that "something" doesn't contain "CDC Influenza DD v1."

If the exclusion were instead written as "^,ANDNOT,^; CDC Influenza DD v1^" , it would exclude visits with "CDC Influenza DD v1" but also exclude any visits that have NULL in the CCDD Category free-text field. To avoid removing these visits, ISNULL can be tacked onto the negation syntax.

A common mistake users make when experimenting with ISNULL statements is to apply ANDNOT statements to ISNULL by doing the following: "(,ISNULL,or,^,),ANDNOT,(,^;CDC Influenza DD v1^,)". This statement will result in visits with NULL not being captured. ANDNOT statements require that a field has contents; but since NULL is the absence of content, fields cannot be NULL and also not be something. This structure, called "three-valued logic," is not intuitive. You can avoid it by simply leaving ISNULL out of the ANDNOT parentheticals. To learn more about three-valued logic and the reasoning behind these NULL and BLANK values, please read Modern SQL: Three-Valued Logic (3VL)—Purpose, Benefits and Special Cases (modern-sql.com).

Three-valued logic‎

SQL uses a three-valued logic: Besides true and false, the result of logical expressions can also be unknown. SQL's three valued logic is a consequence of supporting null to mark absent data. If a null value affects the result of a logical expression, the result is neither true nor false but unknown.1

Summary

  • Some fields can be NULL, which is the absence of value, and some fields can be BLANK, which is a blank, and some fields can be BLANK or NULL (in addition to having a "normal" value).
  • The query "ISNULL" used in a free-text field will return NULLs; the query "ISBLANK" used in a free-text field will return BLANKS.
  • A query for "^" will pull BLANKs, and values, but not NULLs.
  • Never apply a negation to an ISNULL statement.

What's the key takeaway from this article? If you want to run a negation string only in a free-text field, this is the foolproof syntax, regardless of whether the field contains NULLs, BLANKS, both, or neither:

  • ISNULL,or,^,ANDNOT,(,^negation1^,or,^negation2^,or,.......,).

We thank Senior Data Analyst Zachary Stein for volunteering to write a series of articles about free-text coding. Stein does epidemiological work to support NSSP efforts and is an active participant in our Community of Practice (CoP). He initially wrote about free-text coding as an entry on the NSSP–CoP Syndrome Definition Committee forum. The forum generated considerable interest, inspiring this series. Stein acknowledges input provided by others who contributed to the forum post.