§ 1.2 Schema Best Practices
The use of characters and data types can be updated dynamically as the needs of the schema change. These two portions of the schema are under less onerous change management rules than the other portions. Therefore, coverage of them here is abbreviated.
Acceptable Characters
1.2.1 All characters used in federated compliance mapping data structures will conform to the GRCschema.org character table’s rules.
Mapping products must enter text into several different JSON data types. Some of those datatypes, such as term definitions for the dictionary, or URLs, only accept certain characters. Therefore, the federated compliance mapping model maintains a character table of acceptable characters to be used in the framework.
Instead of listing all the characters here (since this list can be updated from time to time), we’ll present one row of the table and explain that row. The full list is maintained online at https://grcschema.org/CharTable. The table below is a representation and is comprised of the following fields:
Char
Unicode
Dec
Hex
Binary
HTML
Definition
!
U+0021
33
21
100001
!
The grapheme for the exclamation mark character. It is represented by the following codes: Unicode U+0021; Dec 33; Hex 21; HTML !. This character is compatible with all known dictionary entries.
Char – the actual grapheme or special character code (such as TAB for a tab character) of the character in question.
Unicode through HTML – the various representations of the character in different formats.
Definition – describes how the character is used, and whether it should be substituted when using it in dictionary entries, URLs, or Markdown language.
Data Types
1.2.2 All data types used in federated compliance mapping model will conform to the datatypes documented at GRCschema.org.
There are various datatypes that each of the JSON objects must follow. The federated compliance mapping model has moved beyond the basic JSON datatypes of null, text, string, etc. and also allows specific datatypes for such things as HostName, TelephoneNumber, Datetime-with-Timezone, etc.
The list of datatypes is maintained online at http://grcschema.org/DataType.
Schema naming
Schema naming will follow a standardized approach to field and table names, and will also follow all restrictions for SQL keywords as SQL databases are the most used in the industry.
1.2.3 Schema property nodes start with a lowercase character.
1.2.4 Schema data types begin with an uppercase character.
1.2.5 Properties are lowercase.
1.2.6 DataTypes and Things are camelCase.
1.2.7 Properties may separate words with underscores “_”.
1.2.7.a Refrain from using spaces “ “ and dashes “-“ in any node name.
Reserved Keywords
1.2.7.b Refrain from using reserved keywords in any JSON “thing” or “datatype”.
SQL Servers use reserved keywords for defining, manipulating, and accessing databases. Re-served keywords are part of the grammar of the Transact-SQL language that is used by SQL Servers to parse and understand Transact-SQL statements and batches. While it is syntactically possible to use reserved keywords by adding special identifiers in Transact-SQL scripts, it is not advisable. Therefore, the following strings are forbidden as field, object, types, and table names.
TERM
TERM
TERM
ABSOLUTE
FETCH
PRIVILEGES
ACTION
FILE
PROC
ADD
FILLFACTOR
PROCEDURE
ALL
FIRST
PUBLIC
ALLOCATE
FLOAT
RAISERROR
ALTER
FOR
READ
AND
FOREIGN
READTEXT
ANY
FOUND
REAL
ARE
FREETEXT
RECONFIGURE
AS
FREETEXTTABLE
REFERENCES
ASC
FROM
RELATIVE
ASSERTION
FULL
REPLICATION
AT
FUNCTION
RESTORE
AUTHORIZATION
GET
RESTRICT
AVG
GLOBAL
RETURN
BACKUP
GO
REVERT
BEGIN
GOTO
REVOKE
BETWEEN
GRANT
RIGHT
BINARY
GROUP
ROLLBACK
BIT
HAVING
ROUND
BIT_LENGTH
HOLDLOCK
ROW
BLOB
HOUR
ROWCOUNT
BOOLEAN
IDENTITY
ROWGUIDCOL
BOTH
IDENTITY_INSERT
ROWID
BREAK
IDENTITYCOL
ROWS
BROWSE
IF
RTRIM
BULK
IMMEDIATE
RULE
BY
IN
SAVE
CASCADE
INDEX
SCHEMA
CASCADED
INDICATOR
SCROLL
CASE
INITIALLY
SECOND
CAST
INNER
SECTION
CATALOG
INPUT
SECURITYAUDIT
CHAR
INSENSITIVE
SELECT
CHAR_LENGTH
INSERT
SEMANTICKEYPHRASETABLE
CHARACTER
INT
SEMANTICSIMILARITYDETAILSTABLE
CHARACTER_LENGTH
INTEGER
SEMANTICSIMILARITYTABLE
CHECK
INTERSECT
SESSION
CHECKPOINT
INTERVAL
SESSION_USER
CHR
INTO
SET
CLOSE
IS
SETUSER
CLUSTERED
ISOLATION
SHUTDOWN
COALESCE
JOIN
SIZE
COLLATE
KEY
SMALLINT
COLLATION
KILL
SOME
COLUMN
LANGUAGE
SPACE
COMMIT
LAST
SQL
COMPUTE
LEADING
SQLCODE
CONNECT
LEFT
SQLERROR
CONNECTION
LENGTH
SQLSTATE
CONSTRAINT
LEVEL
STATISTICS
CONSTRAINTS
LIKE
STRVAL
CONTAINS
LINENO
SUBSTRING
CONTAINSTABLE
LOAD
SUM
CONTINUE
LOCAL
SYSTEM_USER
CONVERT
LONGVARBINARY
TABLE
CORRESPONDING
LOWER
TABLESAMPLE
COUNT
LTRIM
TEMPORARY
CREATE
MATCH
TEXTSIZE
CROSS
MAX
THEN
CURDATE
MERGE
TIES
CURRENT
MIN
TIME
CURRENT_DATE
MINUTE
TIMESTAMP
CURRENT_TIME
MODULE
TIMESTAMPVAL
CURRENT_TIMESTAMP
MONTH
TIMEVAL
CURRENT_USER
MONTHNAME
TIMEZONE_HOUR
CURSOR
NAMES
TIMEZONE_MINUTE
CURTIME
NATIONAL
TO
CURTIMESTAMP
NATURAL
TODAY
DATABASE
NCHAR
TOP
DATE
NEXT
TRAILING
DATEVAL
NO
TRAN
DAY
NOCHECK
TRANSACTION
DAYNAME
NONCLUSTERED
TRANSLATE
DAYOFWEEK
NOT
TRANSLATION
DBCC
NULL
TRIGGER
DEALLOCATE
NULLIF
TRIM
DEC
NUMERIC
TRUNCATE
DECIMAL
NUMVAL
TRY_CONVERT
DECLARE
OCTET_LENGTH
TSEQUAL
DEFAULT
OF
UNION
DEFERRABLE
OFF
UNIQUE
DEFERRED
OFFSET
UNKNOWN
DELETE
OFFSETS
UNPIVOT
DENY
ON
UPDATE
DESC
ONLY
UPDATETEXT
DESCRIBE
OPEN
UPPER
DESCRIPTOR
OPENDATASOURCE
USAGE
DIAGNOSTICS
OPENQUERY
USE
DISCONNECT
OPENROWSET
USER
DISK
OPENXML
USERNAME
DISTINCT
OPTION
USING
DISTRIBUTED
OR
VALUE
DOMAIN
ORDER
VALUES
DOUBLE
OUTER
VARBINARY
DROP
OUTPUT
VARCHAR
DUMP
OVER
VARYING
ELSE
OVERLAPS
VIEW
END
PAD
WAITFOR
END_EXEC
PART
WHEN
ERRLVL
PARTIAL
WHENEVER
ESCAPE
PERCENT
WHERE
EVERY
PIVOT
WHILE
EXCEPT
PLAN
WITH
EXCEPTION
POSITION
WITHIN GROUP
EXEC
PRECISION
WORK
EXECUTE
PREPARE
WRITE
EXISTS
PRESERVE
WRITETEXT
EXIT
PRIMARY
YEAR
EXTERNAL
ZONE
EXTRACT
PRIOR
Schema Structure
Establishing any data format for an API is a common problem – especially when trying to make the API easy to use so anyone can use short strings to reference common things, while also providing a robust interchange format to use in federated projects. Leveraging information from multiple APIs becomes problematic due to namespace or document format conventions that may differ between API endpoints. Moreover, the same principles are often repeated across different endpoints using arbitrary identifiers (name, email, website, etc.); a federated project community needs to stop repeating itself (DRY concept) and reuse common conventions. This model is based on the principles of separation of data model from syntax, the use of discoverable identifiers describing document contents, and general organizing principles that allow documents to be machine understandable (read, interpreted as JSON-LD using Linked Data). Key among these is the notion of vocabulary re-use, so that each endpoint does not need to separately describe the properties and structure of their JSON documents.
Schema Context
1.2.8 All keys used within a JSON document can have unambiguous meaning, being bound to URLs which describe their meaning.
1.2.8.a The context for the schema structure will be defined at grcschema.org, with the top level context being “Thing”.
An example of the bound, unambiguous context is shown below that uses IRIs (Internationalized Resource Identifiers as described in RFC3987) for unambiguous identification. The idea is to use IRIs to assign unambiguous identifiers to data that may be of use to other developers:
{ "@context":{ "schema": "
http://grcschema.org/
", "rdf": "
http://www.w3.org/1999/02/22-rdf-syntax-ns#
", "rdfa": "
http://www.w3.org/ns/rdfa#
", "rdfs": "
http://www.w3.org/2000/01/rdf-schema#
" }, "@id": "Thing", "@type": "schema:Thing", "rdfs:comment": "The most generic type of item.", "rdfs:label": "Thing" }
1.2.9 Each JSON object will be placed in a hierarchy with a single top-level object (type) with cascading, subordinate objects (types) as necessary.
1.2.10 Multiple array values are presumed unordered.
1.2.11 Every JSON object will have core meta data (
coreMetaData
) associated with it.
1.2.11.a Core meta data will include date_created to record the creation date of the record.
1.2.11.b Core meta data will include date_modified to record the date the record was last modi-fied (this will be the same as the creation date when an object is first created).
1.2.11.c Core meta data will include created_by to record the id of the person or agent that cre-ated the object.
1.2.11.d Core meta data will include modified_by to record the id of the person or agent that last modified the object.
1.2.11.e Core meta data will include a live status (live_status) that is a Boolean value, and if the object is no longer live, it will not be removed from the database but only marked as dep-recated (Boolean value of 0).
1.2.11.f Core meta data will include the name of the table that contains the object.
When creating the checksum for the RESTful record validation, the Verhoeff calculation will be used to create the initial checksum from a combination of these three fields:
date_modified: This is the most recent modification date of the record.
id: This is the ID field of the record in question.
live_status: This is a Boolean field indicating if the record is live (1) or deprecated (0).
In combination, these three tell the user which record they are dealing with, when it was last modified, and whether it remains live and ready for use.
The checksum will return a dihedral check integer of those three values that can be used on the receiving end to verify that the record of that ID was last modified on that date and has that live status.
When creating or updating a record in the system
When creating or updating a record, the system will calculate the checksum based on the input and will record the checksum in the checksum node. Given the modification_date of 6/17/2020, ID of 1 and live_status of 1, Verhoeff ( 6/17/2020 & 1 & 1 ; 0 ; 0 ) returns the value 5.
Testing the checksum
Upon receiving data from the system, the user will be to calculate a boolean value by simply reversing the calculation. Given the modification_date of 6/17/2020, ID of 1 and live_status of 1, NOT Verhoeff ( 6/17/2020 & 1 & 1 ; 0 ; 0 ) returns the value 1. Modification of any of the values (date, ID, live_status) returns 0.
The checksum calculation
Returns the Verhoeff dihedral check digit of numericString. Use this function to verify a numeric string protected by Verhoeff check digit, or to generate the correct Verhoeff check digit for a given numeric string.
Format
Verhoeff ( numericString ; index ; checkSum )
Parameters
numericString - a string of numeric characters (digits) or field containing numeric characters. In this case, the string is comprised of the following: date_created & id & live_status.
index - indicates the digit position of the current iteration - needs to be initialized to zero (0) when calling the function.
checkSum - indicates the check digit of the current iteration. This needs to be initialized to zero (0) when calling the function
Data type returned = number
Calculation
Let ( [ n = Right ( numericString ; 1 ) ; p = Let ( [ array = "01234567891576283094580379614289160435279453126870428657390127938064157046913258" ; start = 10 * Mod ( index ; 8 ) + n + 1 ] ; Middle ( array ; start ; 1 ) ) ; d = Let ( [ array = "0123456789123406789523401789563401289567401239567859876043216598710432765982104387659321049876543210" ; start = 10 * checkSum + p + 1 ] ; Middle ( array ; start ; 1 ) ) ; len = Length ( numericString ) ; nextString = Left ( numericString ; len - 1 ) ] ; Case ( len > 1 ; Verhoeff ( nextString ; index + 1 ; d ) ; d ) )
1.2.11.g Core meta data will include a Verhoeff checksum (checksum) that is a positive integer.
1.2.12 All nested arrays will include the unique identifier (id) for each nested object as well as the id of the related table in the format of the foreign table name followed by an underscore and "fk" (foreigntablename_fk).
An example of a nested table with its id as well as the id of the related table is shown below.
{ "@context": { "schema": "http://grcschema.org/", "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#", "rdfa": "http://www.w3.org/ns/rdfa#", "rdfs": "http://www.w3.org/2000/01/rdf-schema#" }, "id": "96", "@type": "Person", "name": "Barack Obama", "givenName": "Barack", "familyName": "Obama", "jobTitle": "44th President of the United States", "spouse": { "id": "13133", "@type": "Person", "name": "Michelle Obama", "Person_fk": "96" } }
§ 1.2.13 When using a property intended to reference another entity or external reference, those properties will be represented as full URLs.
Specification of Arrays
Arrays are specified by the keyword @set followed by JSON’s array brackets “[ ]”. The objects that follow denote the individual objects (fields) within the array.
"@set": [ { "@type": "NonStandardName", "name": "String", "id": "Integer", "organization_fk": "Integer" } ]
§ 1.2.14 When including arrays within objects, the @set will precede the array brackets “[ ]”.
Specification of Types
The @type keyword is used to set the type of a node or the datatype of a typed value (a value, which is a string, and a type, which is an IRI). In Linked Data, it is common to specify the type of a graph node; in many cases, this can be inferred based on the properties used within a given node object, or the property for which a node is a value.
1.2.15 The use of @type immediately following @context sets the 5.15 The use of @type immediately following @context sets the name of the table.
An example of this is shown below:
"@context":"http://grcschema.org/" , "@type":"Role"
1.2.16 When @type follows an object name and isn’t preceding an array it names a predefined group of fields.
Example below:
"coreMetaData": { "@type": "CoreMetaData" "date_created":"2014-10-05" "date_modified":"2014-10-05" "live_status":1 "table":"Language" }
1.2.17 When @type follows an object name and precedes an array it names the related table that should be added.
Example below:
"nonStandardNames": { "@type":"NonStandardNames" "Names": [ 0: { "id":1 "name":"افغانستان" } ] }
Here is an example where you’ve got @type meaning all of the above.
"geoLocation": { "@type":"GeoLocation" <- That describes this grouping of data "borders": { "@type":"Borders" <— That describes the related subtable and fields "border": [...] } "term": { "@type":"Term" <- That just describes a field from a related table "term.id":32131 } "callingCodes": { "@type":"CallingCodes" <— That describes the related subtable and fields "calling_codes": [ 0: {... } ] "timeZones": {... } "latitude":"33" "longitude":"65" }
Local Reference Identifiers
When creating new records (POST), or updating existing records (PATCH), an additional level of validation and non-repudiation is presented as an optional key-value pair. We suggest using a hash-based algorithm such as an md5 hash based on the contents of the fields being passed to the API for creation or updating, such as the one below:
local_reference_id": "81dc9bdb52d04dc20036dbd8313ed055"
§ 1.2.19 For POST and PATCH operations, you may send an optional local_reference_id for the core object and any sub-objects in unordered lists (\@set arrays). This lo-cal_reference_id will be returned (like an echo) for that object.
§ 1.2.20 It is recommended you use a Type 4 UUID which is unique for that object in your sys-tem and tie the federated ID to your record for later use in querying the federated system for that object.
POST Operation Example
For the core object, you may place a local_reference_id at the object root level, and it will be re-turned in the response.
SEND
{ "@context": "http://grcschema.org/", "@type": "NameSuffix", "abbreviation": "Adm", "local_reference_id": "81dc9bdb52d04dc20036dbd8313ed055", "suffix": "Admiral" }
RESPONSE
{ "@context":"http://grcschema.org/", "@type": "NameSuffix", "abbreviation": "Adm", "id": 2, "suffix": "Admiral", "local_reference_id": "81dc9bdb52d04dc20036dbd8313ed055" }
PATCH Operation Example
For the core object, you may place a local_reference_id at the object root level, and it will be returned in the response. Do not add local_reference_id to the container objects like CoreMetaDa-ta.
SEND & RESPONSE
{ "@context":"http://grcschema.org/", "@type": "NameSuffix", "CoreMetaData": { "@type": "CoreMetaData", "date_created": "Date", "date_modified": "Date", "created_by": "Integer", "modified_by": "Integer", "live_status": "Boolean", "checksum": "Integer", "validated": "Boolean" }, "abbreviation": "Adm.", "id": 2, "suffix": "Admiral", "local_reference_id": "81dc9bdb52d04dc20036dbd8313ed055" }
Last updated