Shahzad Bhatti Welcome to my ramblings and rants!

July 22, 2024

Security Challenges in Microservice Architecture

Filed under: Security,Web Services — admin @ 3:58 pm

Abstract

Microservice architectures have gained wide adoption due to their ability to deliver scalability, agility, and resilience. However, the distributed nature of microservices also introduces new security challenges that must be addressed proactively. Security in distributed systems revolves around three fundamental principles: confidentiality, integrity, and availability (CIA). Ensuring the CIA triad is maintained is crucial for protecting sensitive data and ensuring system reliability for MicroServices.

  • Confidentiality ensures that sensitive information is accessible only to authorized users using encryption, access controls, and strong authentication mechanisms [NIST SP 800-53 Rev. 5].
  • Integrity guarantees that data remains accurate and unaltered during storage or transmission using cryptographic hash functions, digital signatures, and data validation processes [Software and Data Integrity Failures].
  • Availability ensures that systems and data are accessible to authorized users when needed. This involves implementing redundancy, failover mechanisms, and regular maintenance [ISO/IEC 27001:2022].

Below, we delve into these principles and the practices essential for building secure distributed systems. We then explore the potential security risks and failures associated with microservices and offer guidance on mitigating them.

Security Practices

The following key practices help establish a strong security posture:

  • Strong Identity Management: Implement robust identity and access management (IAM) systems to ensure that only authenticated and authorized users can access system resources. [AWS IAM Best Practices].
  • Fail Safe: Maintain confidentiality, integrity and availability when an error condition is detected.
  • Defense in Depth: Employ multiple layers of security controls to protect data and systems. This includes network segmentation, firewalls, intrusion detection systems (IDS), and secure coding practices [Microsoft’s Defense in Depth].
  • Least Privilege: A person or process is given only the minimum level of access rights (privileges) that is necessary for that person or process to complete an assigned operation.
  • Separation of Duties: This principle, also known as separation of privilege, requires multiple conditions to be met for task completion, ensuring no single entity has complete control, thereby enhancing security by distributing responsibilities.
  • Zero Trust Security: Not trust any entity by default, regardless of its location, and verification is required from everyone trying to access resources. [NIST Zero Trust Architecture].
  • Auditing and Monitoring: Implement comprehensive logging, monitoring, and auditing practices to detect and respond to security incidents [Center for Internet Security (CIS) Controls].
  • Protecting Data in Motion and at Rest: Use encryption to protect data during transmission (data in motion) and when stored (data at rest). [NIST’s Guide to Storage Encryption Technologies for End User Devices].

Security Methodologies and Frameworks

Following practices ensure that security is integrated throughout the development lifecycle and that potential threats are systematically addressed.

  • DevSecOps: Integrate security practices into the DevOps process to shift security left, addressing issues early in the software development lifecycle.
  • Security by Design (SbD): Incorporate security by design process to ensure robust and secure systems [OWASP Secure Product Design]. The key principles of security by design encompass Memory safe programming languages, Static and dynamic application security testing, Defense-in-Depth, Single sign-on, Secure Logging, Data classification, Secure random number generators, Limit the scope of credentials and access, Address Space Location Randomization(ASLR) and Kernel ASLR (KASLR), Encrypt data at rest optionally with customer managed keys, Encrypt data in transit, Data isolation with multi-tenancy support, Strong secrets management, Principle of Least Privilege and Separation of Duties, and Principle of Security-in-the-Open.
  • Threat Modeling Techniques: Threat modeling involves identifying potential threats to your system, understanding what can go wrong, and determining how to mitigate these risks. Following threat model techniques can be used for identifying and categorizing potential security threats such as
    • STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege) categorizes different types of threats.
    • PASTA (Process for Attack Simulation and Threat Analysis) a risk-centric threat modeling methodology [SEI Threat Modeling].
    • VAST (Visual, Agile, and Simple Threat) scales and integrates with Agile development processes.
  • CAPEC (Common Attack Pattern Enumeration and Classification): A comprehensive dictionary of known attack patterns, providing detailed information about common threats and mitigation techniques.
  • OCTAVE (Operationally Critical Threat, Asset, and Vulnerability Evaluation): A risk-based strategic assessment and planning technique for cybersecurity by the Carnegie Mellon University’s Software Engineering Institute.
  • OWASP Application Security Verification Standard (ASVS): A standard for designing, building, and testing secure applications.
  • OWASP Top Ten: Top Web Application Security Risks such as:
    • A01: Broken Access Control.
    • A02: Cryptographic Failures.
    • A03: Injection including Cross-site Scripting.
    • A04: Insecure Design.
    • A05: Security Misconfiguration.
    • A06: Vulnerable and Outdated Components.
    • A07: Identification and Broken Authentication Failures.
    • A08: Software and Data Integrity Failures including Insecure Deserialization.
    • A09: Security Logging and Monitoring Failures including various failures impacting visibility and incident response.
    • A10: Server-Side Request Forgery (SSRF).
  • CWE TOP 25 Most Dangerous Software Errors:
    • CWE-787: Out-of-bounds Write
    • CWE-79: Improper Neutralization of Input During Web Page Generation (‘Cross-site Scripting’)
    • CWE-89: Improper Neutralization of Special Elements used in an SQL Command (‘SQL Injection’)
    • CWE-416: Use After Free
    • CWE-78: Improper Neutralization of Special Elements used in an OS Command (‘OS Command Injection’)
    • CWE-20: Improper Input Validation
    • CWE-125: Out-of-bounds Read
    • CWE-22: Improper Limitation of a Pathname to a Restricted Directory (‘Path Traversal’)
    • CWE-352: Cross-Site Request Forgery (CSRF)
    • CWE-434: Unrestricted Upload of File with Dangerous Type
    • CWE-862: Missing Authorization
    • CWE-476: NULL Pointer Dereference
    • CWE-287: Improper Authentication
    • CWE-190: Integer Overflow or Wraparound
    • CWE-502: Deserialization of Untrusted Data
    • CWE-77: Improper Neutralization of Special Elements used in a Command (‘Command Injection’)
    • CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer
    • CWE-798: Use of Hard-coded Credentials
    • CWE-918: Server-Side Request Forgery (SSRF)
    • CWE-306: Missing Authentication for Critical Function
    • CWE-362: Concurrent Execution using Shared Resource with Improper Synchronization (‘Race Condition’)
    • CWE-269: Improper Privilege Management
    • CWE-94: Improper Control of Generation of Code (‘Code Injection’)
    • CWE-863: Incorrect Authorization
    • CWE-276: Incorrect Default Permissions
  • SAMM (Software Assurance Maturity Model) A framework for analyzing software security practices.
  • CERT (Computer Emergency Response Team) Coding Standards: Standards for secure coding guidelines developed by the Software Engineering Institute (SEI) at Carnegie Mellon University.
  • SANS Secure Coding Practices: provides secure coding resources.
  • NIST (National Institute of Standards and Technology) Cybersecurity Framework: A comprehensive cybersecurity framework that includes guidelines for secure software development and risk management.
  • ISO/IEC 27034 (Application Security): An international standard provides guidance on information security for application services across their entire lifecycle, including design, development, testing, and maintenance.
  • BSIMM (Building Security In Maturity Model): A framework that helps organizations plan, execute, and measure their software security initiatives.
  • SAFECode (Software Assurance Forum for Excellence in Code): A non-profit organization for secure software development.
  • DREAD(Damage potential, Reproducibility, Exploitability, Affected users, and Discoverability: A model to rate and prioritize risks.

Security Incidents

Security incidents often result from inadequate security measures and oversight, highlighting the importance of rigorous security practices across various aspects of system management and software development.

  • Static Analysis of Source Code: The Heartbleed vulnerability in the OpenSSL cryptographic library allowed attackers to read sensitive memory contents and went undetected for over two years due to lack of static analysis and code reviews.
  • Scope of Privileges and Credentials: In Capital One Data Breach (2019), a former AWS employee exploited misconfigured web application firewall (WAF) credentials to gain unauthorized access to Capital One’s AWS environment, leading to the theft of over 100 million customer records that cost Capital One $270m.
  • Random numbers: Lack of secure random number generation, encryption algorithms and encryption configuration have caused numerous security breaches such as security deficiencies in IoT devices due to bad random number generation, factory-default passwords with security cameras, and attacking SSL with with RC4.
  • Data Classification: Improperly classifying and handling data based on its sensitivity and criticality have been source of security incidents like Equifax Data Breach (2017), which exposed personal information of over 143 million consumers and McDonald’s Data Leak (2017) that leaked personal information about 2.2 million users.
  • Secure Logging: Failure to implement secure logging led incidents like Apache Log4j Vulnerability (2021) that affected numerous applications and systems. Similarly, the lack of logging made it difficult to detect and investigate SolarWinds Supply Chain Attack (2020) that compromised numerous government agencies and companies.
  • Unauthorized Access to Production Data: Failing to implement appropriate controls and policies for governing production data use has led to significant data breaches. For example:
  • Filesystem Security: Failing to properly configure filesystem security led to critical issues such as Dirty Cow Vulnerability (2016) that caused privilege escalation vulnerability, and Shellshock Vulnerability (2014) that allowed remote code execution by exploiting vulnerabilities.
  • Memory protection with ASLR and KASLR: Failing to implement Address Space Layout Randomization (ASLR) and Kernel Address Space Layout Randomization (KASLR) led to the Linux Kernel Flaw (CVE-2024-0646), which exposed systems to privilege escalation attacks.
  • Code Signing: Failing to properly securely signing code with cryptographic digital signatures led to incidents like ASUS Live Update Hack (2019) that affected about 1 million users and Stuxnet Worm (2010) that targeted programmable logic controllers (PLCs).
  • Data Integrity: Failure to implement data integrity verification with cryptographic hashing, digital signatures, or checksums can lead to incidents like:
    • Petya Ransomware Attack (2017): The Petya ransomware, specifically the “NotPetya” variant employed advanced propagation methods, including leveraging legitimate Windows tools and exploiting known vulnerabilities like EternalBlue and EternalRomance.
    • Bangladesh Bank Cyber Heist (2016): Hackers compromised the bank’s systems and initiated fraudulent SWIFT transactions due to the lack of appropriate data integrity controls.
  • Data Privacy: implementing controls to protect data privacy using data minimization, anonymization, encryption, access controls, and compliance with GDPR/CCPA regulations can prevent incidents like:
  • Customer Workloads in Multi-tenant environments: Failing to implement proper security controls and isolation mechanisms when executing customer-provided workloads in a multi-tenant environment can lead to incidents like:
    • Azure Functions Vulnerability: Researchers discovered a vulnerability in Azure Functions that allows privilege escalation bug to potentially permitting an attacker to “plant a backdoor which would have run in every Function invocation”.
    • Docker Container Escape Vulnerability (2019): a vulnerability in runC was reported by its maintainers that affects Docker containers to gain root-level access on the host.
  • Certificate Revocation Validation: Verifying that the digital certificates used for authentication and encryption have not been revoked or compromised using a certificate revocation list (CRL) or using the Online Certificate Status Protocol (OCSP) can prevent incidents like:
  • Encryption and Key Rotation/Lengths: Failures in encryption and key management have led to significant security breaches:
  • Secure Configuration: Failure to implement secure configurations and changes and change management can lead to incidents like AWS S3 Bucket Misconfiguration (2017) where sensitive data from various organizations was exposed due to misconfigured AWS S3 bucket permissions.
  • Secure communication protocols: Failure to implement secure communication protocols, such as TLS/SSL, to protect data in transit and mitigate man- in-the-middle attacks can lead to incidents like:
    • POODLE Attack (2014) exploited a vulnerability in the way SSL 3.0 handled padding, allowing attackers to decrypt encrypted connections.
    • FREAK Attack (2015) exploited a vulnerability in legacy export-grade encryption to allow attackers to perform man-in-the-middle attacks.
  • Secure Authentication: Failure to implement secure authentication mechanisms, such as multi-factor authentication (MFA) and strong password policies can lead to unauthorized access like:
  • Secure Backup and Disaster Recovery: Failure to implement secure procedures for data backup and recovery, including encryption, access controls, and offsite storage, can lead to incidents such as:
    • Code Spaces Data Loss (2014): The Code Spaces was forced to shut down after a catastrophic data loss incident due to a lack of secure backup and recovery procedures.
    • Garmin Ransomware Attack (2020): Garmin was hit by a ransomware attack that disrupted its services and operations, highlighting the importance of secure data backup and recovery procedures.
  • Secure Caching: Implementing proper authentication, access controls, and encryption prevent data leaks or unauthorized access like:
    • Cloudflare Data Leak (2017): A vulnerability in Cloudflare’s cache servers resulted in sensitive data leaking across different customer websites, exposing sensitive information.
    • Memcached DDoS Attacks (2018): Misconfigured Memcached servers were exploited by attackers to launch massive distributed denial-of-service (DDoS) attacks.
  • Privilege Escalation (Least Privilege): Improper privilege management caused Edward Snowden Data Leaks (2013) which allowed Snowden to copy and exfiltrate sensitive data from classified systems. In Capital One Data Breach (2019) breach, an overly permissive IAM policy granted broader access than necessary, violating the principle of least privilege. In addition, a contingent authorization can be granted for temporary or limited access to resources or systems based on specific conditions or events.
  • SPF, DKIM, DMARC: implement the email authentication such as SPF (Sender Policy Framework), DKIM (DomainKeys Identified Mail) and DMARC (Domain-based Message Authentication, Reporting, and Conformance), and anti-spoofing mechanisms for all domains.
  • Multitenancy: Implement secure and isolated processing of service requests in a multi-tenant environment to prevent unauthorized access or data leakage between different tenants like:
  • Identity Management in Mobile applications: Insecure authentication, authorization, and user management mechanisms can lead to incidents like:
  • Secure Default Configuration: The systems and applications should be designed and configured to be secure by default to prevent incidents like:
  • Server-side Template Injection (SSTI): A vulnerability that occurs when user-supplied input is improperly interpreted as part of a server-side template engine, leading to the potential execution of arbitrary code.
    • SSTI in Apache Freemarker (2022): A SSTI vulnerability in the Apache Freemarker templating engine allowed remote code execution in various applications.
    • SSTI in Jinja2: Illustrates how Server-Side Template Injection (SSTI) payloads for Jinja2 can enable remote code execution.
  • Reverse Tabnabbing: A security vulnerability that occurs when a website you trust opens a link in a new tab and an attacker manipulates the website contents with malicious contents.
  • Regions and Partitions Isolation: Isolating the security and controls for each region and partition helps prevent security vulnerabilitiessuch as:
    • AWS US-East-1 Outage (2017): An operational issue in AWS’s US-East-1 region caused widespread service disruptions, affecting numerous customers and services hosted in that region.
    • Google Cloud Engine Outage (2016): A software bug in Google’s central data center caused cascading failures and service disruptions across multiple regions.
  • External Dependencies: Regularly reviewing and assessing external (Software-Defined Object) dependencies for potential security vulnerabilities can mitigate supply chain attacks and security breaches like:
    • Equifax Data Breach (2017): The breach was caused by the failure to patch a vulnerability in the Apache Struts open-source framework used by Equifax.
    • Log4Shell Vulnerability (2021): A critical vulnerability in the Log4j library, used for logging in Java applications, allowed attackers to execute arbitrary code on affected systems.
  • Circular Dependencies: Avoiding circular dependencies in software design can prevent incidents like:
    • Node.js Event-Stream Incident (2018): A malicious actor gained control of the popular “event-stream” package and injected malicious code.
    • Left-Pad Incident (2016): Although not a direct security breach, the removal of the “left-pad” npm package broke thousands of projects due to its circular dependencies.
    • Windows DLL Hijacking: Complex dependency management can lead to DLL hijacking that can execute malicious code.
  • Confused Deputy: The “Confused Deputy” problem, which occurs when a program inadvertently performs privileged operations on behalf of another entity, leading to security breaches:
    • Google Docs Phishing Attack (2017): Attackers exploited a feature in Google Docs to trick users into granting permission to a malicious app disguised as Google Docs.
    • Android Toast Overlay Attack (2017): A vulnerability in the Android operating system allowed malicious apps to display overlay Toast messages that could intercept user input or perform actions without user consent.
  • Validation Before Deserialization: Failure to validate the deserialized data can lead to security vulnerabilities, such as code execution or data tampering attacks like:
  • Generic Error Messages: implement proper error handling and return generic error messages rather than exposing sensitive information or implementation details.
  • Monitoring: Failure to implement proper logging and monitoring mechanisms can make it difficult to detect and respond to security incidents.
    • Uber’s Data Breach (2016): Uber failed to properly monitor and respond to security alerts, resulting in a delayed discovery of the data breach that exposed data of 57 million users and drivers.
    • Target Data Breach (2013): Inadequate logging and monitoring allowed attackers to remain undetected in Target’s systems for several weeks, resulting in the theft of millions of credit card records.
  • Secure Web Design: Implement input validation, secure session management, cross-site scripting (XSS) prevention, cross-site request forgery (CSRF) protection, and industry best practices to prevent incidents like:

Summary

Microservice architectures offer scalability, agility, and resilience but also present unique security challenges. Addressing these challenges requires adhering to the principles of confidentiality, integrity, and availability (CIA). Key security practices include strong identity management, defense in depth, principle of least privilege, zero trust security, comprehensive auditing and monitoring, and protecting data in motion and at rest. Security methodologies and frameworks like DevSecOps, Security by Design (SbD), and threat modeling techniques (e.g., STRIDE, PASTA) ensure robust security integration throughout the development lifecycle. Real-world incidents highlight the consequences of inadequate security measures. Implementing secure communication protocols, authentication mechanisms, and data backup procedures are crucial. Overall, a proactive and comprehensive approach to security, incorporating established practices and frameworks, is vital for safeguarding microservice architectures and distributed systems.

January 1, 2023

Consumer-driven and Producer-generated Contract Testing for REST APIs

Filed under: REST,Testing,Web Services — admin @ 9:43 pm

Though, REST standard for remote APIs is fairly loose but you can document API shape and structure using standards such as Open API and swagger specifications. The documented API specification ensures that both consumer/client and producer/server side abide by the specifications and prevent unexpected behavior. The API provider may also define service-level objective (SLO) so that API meets specified latency, security and availability and other service-level indicators (SLI). The API provider can use contract tests to validate the API interactions based on documented specifications. The contract testing includes both consumer and producer where a consumer makes an API request and the producer produces the result. The contract tests ensures that both consumer requests and producer responses match the contract request and response definitions per API specifications. These contract tests don’t just validate API schema instead they validate interactions between consumer and producer thus they can also be used to detect any breaking or backward incompatible changes so that consumers can continue using the APIs without any surprises.

In order to demonstrate contract testing, we will use api-mock-service library to generate mock/stub client requests and server responses based on Open API specifications or customized test contracts. These test contracts can be used by both consumers and producers for validating API contracts and evolve the contract tests as API specifications are updated.

Sample REST API Under Test

A sample eCommerce application will be used to demonstrate contracts testing. The application will use various REST APIs to implement online shopping experience. The primary purpose of this example is to show how different request structures can be passed to the REST APIs and then generate a valid result or an error condition for contract testing. You can view the Open-API specifications for this sample app here.

Customer REST APIs

The customer APIs define operations to manage customers who shop online, e.g.:

Customer APIs

Product REST APIs

The product APIs define operations to manage products that can be shopped online, e.g.:

Product APIs

Payment REST APIs

The payment APIs define operations to charge credit card and pay for online shopping, e.g.:

Payment APIs

Order REST APIs

The order APIs define operations to purchase a product from the online store and it will use above APIs to validate customers, check product inventory, charge payments and then store record of orders, e.g.:

Order APIs

Generating Stub Server Responses based on Open-API Specifications

In this example, stub server responses will be generated by api-mock-service based on open-api specifications ecommerce-api.json by starting the mock service first as follows:

docker pull plexobject/api-mock-service:latest
docker run -p 8000:8000 -p 9000:9000 -e HTTP_PORT=8000 -e PROXY_PORT=9000 \
	-e DATA_DIR=/tmp/mocks -e ASSET_DIR=/tmp/assets api-mock-service

And then uploading open-API specifications for ecommerce-api.json:

curl -H "Content-Type: application/yaml" --data-binary @ecommerce-api.json \
	http://localhost:8000/_oapi

It will generate test contracts with stub/mock responses for all APIs defined in the ecommerce-api.json Open API specification. For example, you can produce result of customers REST APIs, e.g.:

curl http://localhost:8000/customers

to produce:

[
  {
    "address": {
      "city": "PpCJyfKUomUOdhtxr",
      "countryCode": "US",
      "id": "ede97f59-2ef2-48e5-913f-4bce0f152603",
      "streetAddress": "Se somnis cibo oculi, die flammam petimus?",
      "zipCode": "06826"
    },
    "creditCard": {
      "balance": {
        "amount": 53965,
        "currency": "CAD"
      },
      "cardNumber": "7345-4444-5461",
      "customerId": "WB97W4L2VQRRkH5L0OAZGk0MT957r7Z",
      "expiration": "25/0000",
      "id": "ae906a78-0aff-4d4e-ad80-b77877f0226c",
      "type": "VISA"
    },
    "email": "abigail.appetitum@dicant.net",
    "firstName": "sciam",
    "id": "21c82838-507a-4745-bc1b-40e6e476a1fb",
    "lastName": "inquit",
    "phone": "1-717-5555-3010"
  },
...  

Above response is randomly generated based on the types/formats/regex/min-max limits of properties defined in Open-API and calling this API will automatically generate all valid and error responses, e.g. calling “curl http://localhost:8000/customers” again will return:

* Mark bundle as not supporting multiuse
< HTTP/1.1 500 Internal Server Error
< Content-Type:
< Vary: Origin
< X-Mock-Path: /customers
< X-Mock-Request-Count: 9
< X-Mock-Scenario: getCustomerByEmail-customers-500-8a93b6c60c492e730ea149d5d09e79d85701c01dbc017d178557ed1d2c1bad3d
< Date: Sun, 01 Jan 2023 20:41:17 GMT
< Content-Length: 67
<
* Connection #0 to host localhost left intact
{"logRef":"achieve_output_fresh","message":"buffalo_rescue_street"}

Consumer-driven Contract Testing

Upon uploading the Open-API specifications of microservices, the api-mock-service generates test contracts for each REST API and response statuses. You can then customize these test cases for consumer-driven contract testing.

For example, here is the default test contract generated for finding a customer by id with path “/customers/:id”:

method: GET
name: getCustomer-customers-200-61a298e
path: /customers/:id
description: ""
predicate: ""
request:
    match_query_params: {}
    match_headers: {}
    match_contents: '{}'
    path_params:
        id: \w+
    query_params: {}
    headers: {}
response:
    headers: 
      Content-Type:
        - application/json
    contents: '{"address":{"city":"{{RandStringMinMax 2 60}}","countryCode":"{{EnumString `US CA`}}","id":"{{UUID}}","streetAddress":"{{RandRegex `\\w+`}}","zipCode":"{{RandRegex `\\d{5}`}}"},"creditCard":{"balance":{"amount":{{RandNumMinMax 0 0}},"currency":"{{RandRegex `(USD|CAD|EUR|AUD)`}}"},"cardNumber":"{{RandRegex `\\d{4}-\\d{4}-\\d{4}`}}","customerId":"{{RandStringMinMax 30 36}}","expiration":"{{RandRegex `\\d{2}/\\d{4}`}}","id":"{{UUID}}","type":"{{EnumString `VISA MASTERCARD AMEX`}}"},"email":"{{RandRegex `.+@.+\\..+`}}","firstName":"{{RandRegex `\\w`}}","id":"{{UUID}}","lastName":"{{RandRegex `\\w`}}","phone":"{{RandRegex `1-\\d{3}-\\d{4}-\\d{4}`}}"}'
    contents_file: ""
    status_code: 200
wait_before_reply: 0s

Above template demonstrates interaction between consumer and producer by defining properties such as:

  • method – of REST API such as GET/POST/PUT/DELETE
  • name – of the test case
  • path of REST API
  • description – of test
  • predicate – defines a condition which must be true to select this test contract
  • request section defines input properties for the REST API including:
    • match_query_params – to match query input parameters for selecting the test contract
    • match_headers – to match input headers for selecting the test contract
    • match_contents – defines regex for selecting input body
    • path_params – defines path variables and regex
    • query_params and headers – defines sample input parameters and headers
  • response section defines output properties for the REST API including:
    • headers – defines response headers
    • contents – defines body of response
    • contents_file – allows loading response from a file
    • status_code – defines HTTP response status
  • wait_before_reply – defines wait time before returning response

You can then invoke test contract using:

curl http://localhost:8000/customers/1

that generates test case from the mock/stub server provided by the api-mock-service library, e.g.

{
  "address": {
    "city": "PanHQyfbHZVw",
    "countryCode": "US",
    "id": "ff5d0e98-daa5-49c8-bb79-f2d7274f2fb1",
    "streetAddress": "Sumus o proferens etiamne intuerer fugasti, nuntiantibus da?",
    "zipCode": "01364"
  },
  "creditCard": {
    "balance": {
      "amount": 80704,
      "currency": "USD"
    },
    "cardNumber": "3226-6666-2214",
    "customerId": "0VNf07XNWkLiIBhfmfCnrE1weTlkhmxn",
    "expiration": "24/5555",
    "id": "f9549ef3-a5eb-4df4-a8a9-85a30a6a49c6",
    "type": "VISA"
  },
  "email": "amanda.doleat@fructu.com",
  "firstName": "quaero",
  "id": "9aeee733-932d-4244-a6f8-f21d2883fd27",
  "lastName": "habeat",
  "phone": "1-052-5555-4733"
}

You can customize above response contents using builtin template functions in the api-mock-service library or create additional test contracts for each distinct input parameter. For example, following contract defines interaction between consumer and producer to add a new customer:

method: POST
name: saveCustomer-customers-200-ddfceb2
path: /customers
description: ""
order: 0
group: Sample Ecommerce API
predicate: ""
request:
    match_query_params: {}
    match_headers: {}
    match_contents: '{"address.city":"(__string__\\w+)","address.countryCode":"(__string__(US|CA))","address.streetAddress":"(__string__\\w+)","address.zipCode":"(__string__\\d{5})","creditCard.balance.amount":"(__number__[+-]?((\\d{1,10}(\\.\\d{1,5})?)|(\\.\\d{1,10})))","creditCard.balance.currency":"(__string__(USD|CAD|EUR|AUD))","creditCard.cardNumber":"(__string__\\d{4}-\\d{4}-\\d{4})","creditCard.customerId":"(__string__\\w+)","creditCard.expiration":"(__string__\\d{2}/\\d{4})","creditCard.type":"(__string__(VISA|MASTERCARD|AMEX))","email":"(__string__.+@.+\\..+)","firstName":"(__string__\\w)","lastName":"(__string__\\w)","phone":"(__string__1-\\d{3}-\\d{4}-\\d{4})"}'
    path_params: {}
    query_params: {}
    headers:
        ContentsType: application/json
    contents: '{"address":{"city":"__string__\\w+","countryCode":"__string__(US|CA)","streetAddress":"__string__\\w+","zipCode":"__string__\\d{5}"},"creditCard":{"balance":{"amount":"__number__[+-]?((\\d{1,10}(\\.\\d{1,5})?)|(\\.\\d{1,10}))","currency":"__string__(USD|CAD|EUR|AUD)"},"cardNumber":"__string__\\d{4}-\\d{4}-\\d{4}","customerId":"__string__\\w+","expiration":"__string__\\d{2}/\\d{4}","type":"__string__(VISA|MASTERCARD|AMEX)"},"email":"__string__.+@.+\\..+","firstName":"__string__\\w","lastName":"__string__\\w","phone":"__string__1-\\d{3}-\\d{4}-\\d{4}"}'
    example_contents: |
        address:
            city: Ab fabrorum meminerim conterritus nota falsissime deum?
            countryCode: CA
            streetAddress: Mei nisi dum, ab amaremus antris?
            zipCode: "00128"
        creditCard:
            balance:
                amount: 3000.4861560368768
                currency: USD
            cardNumber: 7740-7777-6114
            customerId: Fudi eodem sed habitaret agam pro si?
            expiration: 85/2222
            type: AMEX
        email: larry.neglecta@audio.edu
        firstName: fatemur
        lastName: gaudeant
        phone: 1-543-8888-2641
response:
    headers: 
      Content-Type: 
        - application/json
    contents: '{"address":{"city":"{{RandStringMinMax 2 60}}","countryCode":"{{EnumString `US CA`}}","id":"{{UUID}}","streetAddress":"{{RandRegex `\\w+`}}","zipCode":"{{RandRegex `\\d{5}`}}"},"creditCard":{"balance":{"amount":{{RandNumMinMax 0 0}},"currency":"{{RandRegex `(USD|CAD|EUR|AUD)`}}"},"cardNumber":"{{RandRegex `\\d{4}-\\d{4}-\\d{4}`}}","customerId":"{{RandStringMinMax 30 36}}","expiration":"{{RandRegex `\\d{2}/\\d{4}`}}","id":"{{UUID}}","type":"{{EnumString `VISA MASTERCARD AMEX`}}"},"email":"{{RandRegex `.+@.+\\..+`}}","firstName":"{{RandRegex `\\w`}}","id":"{{UUID}}","lastName":"{{RandRegex `\\w`}}","phone":"{{RandRegex `1-\\d{3}-\\d{4}-\\d{4}`}}"}'
    contents_file: ""
    status_code: 200
wait_before_reply: 0s

Above template defines interaction for adding a new customer where request section defines format of request and matching criteria using match_content property. The response section includes the headers and contents that are generated by the stub/mock server for consumer-driven contract testing. You can then invoke test contract using:

curl -X POST http://localhost:8000/customers -d '{"address":{"city":"rwjJS","countryCode":"US","id":"4a788c96-e532-4a97-9b8b-bcb298636bc1","streetAddress":"Cura diu me, miserere me?","zipCode":"24121"},"creditCard":{"balance":{"amount":57012,"currency":"USD"},"cardNumber":"5566-2222-8282","customerId":"tgzwgThaiZqc5eDwbKk23nwjZqkap7","expiration":"70/6666","id":"d966aafa-c28b-4078-9e87-f7e9d76dd848","type":"VISA"},"email":"andrew.recorder@ipsas.net","firstName":"quendam","id":"071396bb-f8db-489d-a8f7-bbcce952ecef","lastName":"formaeque","phone":"1-345-6666-0618"}'

Which will return a response such as:

{
  "address": {
    "city": "j77oUSSoB5lJCUtc4scxtm0vhilPRdLE7Nc8KzAunBa87OrMerCZI",
    "countryCode": "CA",
    "id": "9bb21030-29d0-44be-8f5a-25855e38c164",
    "streetAddress": "Qui superbam imago cernimus, sensarum nuntii tot da?",
    "zipCode": "08020"
  },
  "creditCard": {
    "balance": {
      "amount": 75666,
      "currency": "AUD"
    },
    "cardNumber": "1383-8888-5013",
    "customerId": "nNaUd15lf6lqkAEwKoguVTvBnPMBVDhdeO",
    "expiration": "73/5555",
    "id": "554efad7-17ab-49f9-967a-3e47381a4d34",
    "type": "AMEX"
  },
  "email": "deborah.vivit@desivero.gov",
  "firstName": "contexo",
  "id": "db70b737-ee1d-48ed-83da-c5a8773c7a5f",
  "lastName": "delectat",
  "phone": "1-013-7777-0054"
}

Note: The response will not match the request body as the contract testing only tests interactions between consumer and producer without maintaining any server side state. You can use other types of testing such as integration/component/functional testing for validating state based behavior.

Producer-driven Generated Tests

The process of defining contracts to generate tests for validating producer REST APIs is similar to consumer-driven contracts. For example, you can upload open-api specifications or user-defined contracts to the api-mock-service provided mock/stub server.

For example, you can upload open-API specifications for ecommerce-api.json as follows:

curl -H "Content-Type: application/yaml" --data-binary @ecommerce-api.json \
	http://localhost:8000/_oapi

Upon uploading the specifications, the mock server will generate contracts for each REST API and status. You can customize those contracts with additional validation or assertion and then invoke server generated tests either by specifying the REST API or invoke multiple REST APIs belonging to a specific group. You can also define an order for executing tests in a group and can optionally pass data from one invocation to the next invocation of REST API.

For testing purpose, we will customize customer REST APIs for adding a new customer and fetching a customer by its id, i.e.,

A contract for adding a new customer

method: POST
name: save-customer
path: /customers
group: customers
order: 0
request:
    headers:
        Content-Type: application/json
    contents: |
        address:
            city: {{RandCity}}
            countryCode: {{EnumString `US CA`}}
            id: {{UUID}}
            streetAddress: {{RandSentence 2 3}}
            zipCode: {{RandRegex `\d{5}`}}
        creditCard:
            balance:
                amount: {{RandNumMinMax 20 500}}
                currency: {{EnumString `USD CAD`}}
            cardNumber: {{RandRegex `\d{4}-\d{4}-\d{4}`}}
            customerId: {{UUID}}
            expiration: {{RandRegex `\d{2}/\d{4}`}}
            id: {{UUID}}
            type: {{EnumString `VISA MASTERCARD`}}
        email: {{RandEmail}}
        firstName: {{RandName}}
        id: {{UUID}}
        lastName: {{RandName}}
        phone: {{RandRegex `1-\d{3}-\d{3}-\d{4}`}}
response:
    match_headers: {}
    match_contents: '{"address.city":"(__string__\\w+)","address.countryCode":"(__string__(US|CA))","address.id":"(__string__\\w+)","address.streetAddress":"(__string__\\w+)","address.zipCode":"(__string__\\d{5}.?\\d{0,4})","creditCard.balance.amount":"(__number__[+-]?(([0-9]{1,10}(\\.[0-9]{1,5})?)|(\\.[0-9]{1,10})))","creditCard.balance.currency":"(__string__\\w+)","creditCard.cardNumber":"(__string__[\\d-]{10,20})","creditCard.customerId":"(__string__\\w+)","creditCard.expiration":"(__string__\\d{2}.\\d{4})","creditCard.id":"(__string__\\w+)","creditCard.type":"(__string__(VISA|MASTERCARD|AMEX))","email":"(__string__.+@.+\\..+)","firstName":"(__string__\\w+)","id":"(__string__\\w+)","lastName":"(__string__\\w+)","phone":"(__string__[\\-\\w\\d]{9,15})"}'
    pipe_properties:
      - id
      - email
    assertions:
      - VariableContains contents.email @
      - VariableContains contents.creditCard.type A
      - VariableContains headers.Content-Type application/json
      - VariableEQ status 200

The request section defines content property that will build the input request, which will be sent to the producer provided REST API. The server section defines match_contents to match regex of each response property. In addition, the response section defines assertions to compare against response contents, headers or status against expected output.

A contract for finding an existing customer

method: GET
name: get-customer
path: /customers/{{.id}}
description: ""
order: 1
group: customers
predicate: ""
request:
    path_params:
        id: \w+
    query_params: {}
    headers:
      Content-Type: application/json
    contents: ""
    example_contents: ""
response:
    headers: {}
    match_headers:
      Content-Type: application/json    
    match_contents: '{"address.city":"(__string__\\w+)","address.countryCode":"(__string__(US|CA))","address.streetAddress":"(__string__\\w+)","address.zipCode":"(__string__\\d{5})","creditCard.balance.amount":"(__number__[+-]?((\\d{1,10}(\\.\\d{1,5})?)|(\\.\\d{1,10})))","creditCard.balance.currency":"(__string__(USD|CAD|EUR|AUD))","creditCard.cardNumber":"(__string__\\d{4}-\\d{4}-\\d{4})","creditCard.customerId":"(__string__\\w+)","creditCard.expiration":"(__string__\\d{2}/\\d{4})","creditCard.type":"(__string__(VISA|MASTERCARD|AMEX))","email":"(__string__.+@.+\\..+)","firstName":"(__string__\\w)","lastName":"(__string__\\w)","phone":"(__string__1-\\d{3}-\\d{3}-\\d{4})"}'
    pipe_properties:
      - id
      - email
    assertions:
      - VariableContains contents.email @
      - VariableContains contents.creditCard.type A
      - VariableContains headers.Content-Type application/json
      - VariableEQ status 200

Above template defines similar properties to generate request body and defines match_contents with assertions to match expected output headers, body and status. Based on order of tests, the generated test to add new customer will be executed first, which will be followed by the test to find a customer by id. As we are testing against real REST APIs, the REST API path is defined as “/customers/{{.id}}” for finding a customer will populate the id from the output of first test based on the pipe_properties.

Uploading Contracts

Once you have the api-mock-service mock server running, you can upload contracts using:

curl -H "Content-Type: application/yaml" --data-binary @fixtures/get_customer.yaml \
	http://localhost:8000/_scenarios
curl -H "Content-Type: application/yaml" --data-binary @fixtures/save_customer.yaml \
	http://localhost:8000/_scenarios

You can start your service before invoking generated tests, e.g. we will use sample-openapi for the testing purpose and then invoke the generated tests using:

curl -X POST http://localhost:8000/_contracts/customers -d \
	'{"base_url": "http://localhost:8080", "execution_times": 5, "verbose": true}'

Above command will execute all tests for customers group and it will invoke each REST API 5 times. After executing the APIs, it will generate result as follows:

{
  "results": {
    "get-customer_0": {
      "email": "anna.intra@amicum.edu",
      "id": "fa7a06cd-1bf1-442e-b761-d1d074d24373"
    },
    "get-customer_1": {
      "email": "aaron.sequi@laetus.gov",
      "id": "c5128ac0-865c-4d91-bb0a-23940ac8a7cb"
    },
    "get-customer_2": {
      "email": "edward.infligi@evellere.com",
      "id": "a485739f-01d4-442e-9ddc-c2656ba48c63"
    },
    "get-customer_3": {
      "email": "gary.volebant@istae.com",
      "id": "ef0eacd0-75cc-484f-b9a4-7aebfe51d199"
    },
    "get-customer_4": {
      "email": "alexis.dicant@displiceo.net",
      "id": "da65b914-c34e-453b-8ee9-7f0df598ac13"
    },
    "save-customer_0": {
      "email": "anna.intra@amicum.edu",
      "id": "fa7a06cd-1bf1-442e-b761-d1d074d24373"
    },
    "save-customer_1": {
      "email": "aaron.sequi@laetus.gov",
      "id": "c5128ac0-865c-4d91-bb0a-23940ac8a7cb"
    },
    "save-customer_2": {
      "email": "edward.infligi@evellere.com",
      "id": "a485739f-01d4-442e-9ddc-c2656ba48c63"
    },
    "save-customer_3": {
      "email": "gary.volebant@istae.com",
      "id": "ef0eacd0-75cc-484f-b9a4-7aebfe51d199"
    },
    "save-customer_4": {
      "email": "alexis.dicant@displiceo.net",
      "id": "da65b914-c34e-453b-8ee9-7f0df598ac13"
    }
  },
  "errors": {},
  "metrics": {
    "getcustomer_counts": 5,
    "getcustomer_duration_seconds": 0.006,
    "savecustomer_counts": 5,
    "savecustomer_duration_seconds": 0.006
  },
  "succeeded": 10,
  "failed": 0
}

Though, generated tests are executed against real services, it’s recommended that the service implementation use test doubles or mock services for any dependent services as contract testing is not meant to replace component or end-to-end tests that provide better support for integration testing.

Recording Consumer/Producer interactions for Generating Stub Requests and Responses

The contract testing does not always depend on API specifications such as Open API and swagger and instead you can record interactions between consumers and producers using api-mock-service tool.

For example, if you have an existing REST API or a legacy service such as above sample API, you can record an interaction as follows:

export http_proxy="http://localhost:9000"
export https_proxy="http://localhost:9000"
curl -X POST -H "Content-Type: application/json" http://localhost:8080/customers -d \
	'{"address":{"city":"rwjJS","countryCode":"US","id":"4a788c96-e532-4a97-9b8b-bcb298636bc1","streetAddress":"Cura diu me, miserere me?","zipCode":"24121"},"creditCard":{"balance":{"amount":57012,"currency":"USD"},"cardNumber":"5566-2222-8282","customerId":"tgzwgThaiZqc5eDwbKk23nwjZqkap7","expiration":"70/6666","id":"d966aafa-c28b-4078-9e87-f7e9d76dd848","type":"VISA"},"email":"andrew.recorder@ipsas.net","firstName":"quendam","id":"071396bb-f8db-489d-a8f7-bbcce952ecef","lastName":"formaeque","phone":"1-345-6666-0618"}'

This will invoke the remote REST API, record contract interactions and then return server response:

{
  "id": "95d655e1-405e-4087-8a7d-56791eaf51cc",
  "firstName": "quendam",
  "lastName": "formaeque",
  "email": "andrew.recorder@ipsas.net",
  "phone": "1-345-6666-0618",
  "creditCard": {
    "id": "d966aafa-c28b-4078-9e87-f7e9d76dd848",
    "customerId": "tgzwgThaiZqc5eDwbKk23nwjZqkap7",
    "type": "VISA",
    "cardNumber": "5566-2222-8282",
    "expiration": "70/6666",
    "balance": {
      "amount": 57012,
      "currency": "USD"
    }
  },
  "address": {
    "id": "4a788c96-e532-4a97-9b8b-bcb298636bc1",
    "streetAddress": "Cura diu me, miserere me?",
    "city": "rwjJS",
    "zipCode": "24121",
    "countryCode": "US"
  }
}

The recorded contract can be used to generate the stub response, e.g. following configuration defines the recorded contract:

method: POST
name: recorded-customers-200-55240a69747cac85a881a3ab1841b09c2c66d6a9a9ae41c99665177d3e3b5bb7
path: /customers
description: recorded at 2023-01-02 03:18:11.80293 +0000 UTC for http://localhost:8080/customers
order: 0
group: customers
predicate: ""
request:
    match_query_params: {}
    match_headers:
        Content-Type: application/json
    match_contents: '{"address.city":"(__string__\\w+)","address.countryCode":"(__string__\\w+)","address.id":"(.+)","address.streetAddress":"(__string__\\w+)","address.zipCode":"(__string__\\d{5,5})","creditCard.balance.amount":"(__number__[+-]?\\d{1,10})","creditCard.balance.currency":"(__string__\\w+)","creditCard.cardNumber":"(__string__\\d{4,4}[-]\\d{4,4}[-]\\d{4,4})","creditCard.customerId":"(.+)","creditCard.expiration":"(.+)","creditCard.id":"(.+)","creditCard.type":"(__string__\\w+)","email":"(__string__\\w+.?\\w+@\\w+.?\\w+)","firstName":"(__string__\\w+)","id":"(.+)","lastName":"(__string__\\w+)","phone":"(__string__\\d{1,1}[-]\\d{3,3}[-]\\d{4,4}[-]\\d{4,4})"}'
    path_params: {}
    query_params: {}
    headers:
        Accept: '*/*'
        Content-Length: "522"
        Content-Type: application/json
        User-Agent: curl/7.65.2
    contents: '{"address":{"city":"rwjJS","countryCode":"US","id":"4a788c96-e532-4a97-9b8b-bcb298636bc1","streetAddress":"Cura diu me, miserere me?","zipCode":"24121"},"creditCard":{"balance":{"amount":57012,"currency":"USD"},"cardNumber":"5566-2222-8282","customerId":"tgzwgThaiZqc5eDwbKk23nwjZqkap7","expiration":"70/6666","id":"d966aafa-c28b-4078-9e87-f7e9d76dd848","type":"VISA"},"email":"andrew.recorder@ipsas.net","firstName":"quendam","id":"071396bb-f8db-489d-a8f7-bbcce952ecef","lastName":"formaeque","phone":"1-345-6666-0618"}'
    example_contents: ""
response:
    headers:
        Content-Type:
            - application/json
        Date:
            - Mon, 02 Jan 2023 03:18:11 GMT
    contents: '{"id":"95d655e1-405e-4087-8a7d-56791eaf51cc","firstName":"quendam","lastName":"formaeque","email":"andrew.recorder@ipsas.net","phone":"1-345-6666-0618","creditCard":{"id":"d966aafa-c28b-4078-9e87-f7e9d76dd848","customerId":"tgzwgThaiZqc5eDwbKk23nwjZqkap7","type":"VISA","cardNumber":"5566-2222-8282","expiration":"70/6666","balance":{"amount":57012.00,"currency":"USD"}},"address":{"id":"4a788c96-e532-4a97-9b8b-bcb298636bc1","streetAddress":"Cura diu me, miserere me?","city":"rwjJS","zipCode":"24121","countryCode":"US"}}'
    contents_file: ""
    example_contents: ""
    status_code: 200
    match_headers: {}
    match_contents: '{"address.city":"(__string__\\w+)","address.countryCode":"(__string__\\w+)","address.id":"(.+)","address.streetAddress":"(__string__\\w+)","address.zipCode":"(__string__\\d{5,5})","creditCard.balance.amount":"(__number__[+-]?\\d{1,10})","creditCard.balance.currency":"(__string__\\w+)","creditCard.cardNumber":"(__string__\\d{4,4}[-]\\d{4,4}[-]\\d{4,4})","creditCard.customerId":"(.+)","creditCard.expiration":"(.+)","creditCard.id":"(.+)","creditCard.type":"(__string__\\w+)","email":"(__string__\\w+.?\\w+@\\w+.?\\w+)","firstName":"(__string__\\w+)","id":"(.+)","lastName":"(__string__\\w+)","phone":"(__string__\\d{1,1}[-]\\d{3,3}[-]\\d{4,4}[-]\\d{4,4})"}'
    pipe_properties: []
    assertions: []
wait_before_reply: 0s

You can then invoke consumer-driven contracts to generate stub response or invoke generated tests to test against producer implementation as described in earlier section. Another benefit of capturing test contracts using recorded session is that it can accurately capture all URLs, parameters and headers for both requests and responses so that contract testing can precisely validate against existing behavior.

Summary

Though, unit-testing, component testing and end-to-end testing are a common testing strategies that are used by most organizations but they don’t provide adequate support to validate API specifications and interactions between consumers/clients and producers/providers of the APIs. The contract testing ensures that consumers and producers will not deviate from the specifications and can be used to validate changes for backward compatibility when APIs are evolved. This also decouples consumers and producers if the API is still in development as both parties can write code against the agreed contracts and test them independently. A service owner can generate producer contracts using tools such as api-mock-service based on Open API specification or user-defined constraints. The consumers can provide their consumer-driven contracts to the service providers to ensure that the API changes don’t break any consumers. These contracts can be stored in a source code repository or on a registry service so that contract testing can easily access them and execute them as part of the build and deployment pipelines. The api-mock-service tool greatly assists in adding contract testing to your software development lifecycle and is freely available from https://github.com/bhatti/api-mock-service.

February 6, 2016

Building a Generic Data Service

Filed under: Web Services — admin @ 10:44 pm

As REST based Micro-Services have become prevalent, I often find that web and mobile clients have to connect to different services for gathering data. You may have to call dozens of services to display data on a single screen or page. Also, you may only need subset of data from each service but you still have to pay for the bandwidth and parsing cost.

I created a new Java framework PlexDataProviders for aggregating and querying data from various underlying sources, which can be used to build a general-purpose data service. PlexDataProviders is a light-weight Java framework that abstract access to various data providers such as databases, files, web services, etc. It allows aggregation of data from various data providers.

The PlexDataProviders framework is divided into two components:

  • Data Provider – This component defines interfaces that are implemented to access data sources such as database or web services.
  • Query Engine – This component is used for querying and aggregating data.

The query engine can determine dependency between providers and it also allow you to use output of one of the data provider as input to another data provider. For example, let’s assume:

  • data-provider A requires input-a1, input-a2 and produces output-a1, output-a2
  • data-provider B requires input-b1 and output-a1 and produces output-b1, output-b2

Then you can pass input-a1, input-a2 to the query engine and request output-a1, output-a2, output-b1, output-b2 output data fields.

Benefits

PlexDataProviders provides offers following benefits:

  • It provides a unified way to search data and abstracts integration to underlying data sources.
  • It helps simplifying client side logic as they can use a single data service to query all data instead of using multiple data services.
  • This also help with managing end-points as you only a single end-point instead of connecting to multiple web services.
  • As clients can specify the data they need, this helps with payload size and network bandwidth.
  • The clients only need to create a single data parser so it keeps JSON parsing logic simple.
  • As PlexDataProviders supports multi-threading, it also helps with latency of the data fetch requests.
  • It partial failure so that a failure in a single data provider doesn’t effect other data providers and the data service can still return partial results. User
  • It supports timeout so that clients can receive available data that completes in given timeout interval

Data Structure

Following are primary data structures:

  • MetaField – This class defines meta information for each data field such as name, kind, type, etc.
  • MetaFieldType – This enum class supports primitive data types supported, i.e.
    • SCALAR_TEXT – simple text
    • SCALAR_INTEGER – integer numbers
    • SCALAR_DECIMAL – decimal numbers
    • SCALAR_DATE – dates
    • SCALAR_BOOLEAN – boolean
    • VECTOR_TEXT – array of text
    • VECTOR_INTEGER – array of integers
    • VECTOR_DECIMAL – array of decimals
    • VECTOR_DATE – array of dates
    • VECTOR_BOOLEAN – array of boolean
    • BINARY – binary data
    • ROWSET – nested data rowsets
  • Metadata – This class defines a set of MetaFields used in DataRow/DataRowSet
  • DataRow – This class abstracts a row of data fields
  • DataRowSet – This class abstracts a set of rows

PlexDataProviders also supports nested structures where a data field in DataRow can be instance of DataRowSet.

Adding a Data Provider

The data provider implements following two interfaces

[codesyntax lang="java"]
public interface DataProducer {
    void produce(DataRowSet requestFields, DataRowSet responseFields,
            QueryConfiguration config) throws DataProviderException;
}
[/codesyntax]

Note that QueryConfiguration defines additional parameters such as:

  • pagination parameters
  • ordering/grouping
  • filtering parameters
  • timeout parameters

The timeout parameter can be used to return all available data within defined time, e.g. query engine may invoke underlying data providers in multiple threads and if underlying query takes a long time then it would return available data.

[codesyntax lang="java"]
public interface DataProvider extends DataProducer, Comparable<DataProvider> {
    String getName();

    int getRank();

    Metadata getMandatoryRequestMetadata();

    Metadata getOptionalRequestMetadata();

    Metadata getResponseMetadata();

    TaskGranularity getTaskGranularity();
}
[/codesyntax]

Each provider defines name, rank (or priority when matching for best provider), set of mandatory/optional input and output data fields. The data provider can also define granularity as coarse grain or fine grain and the implementation may execute those providers on different threads.

PlexDataProviders also provides interfaces for converting data from domain objects to DataRowSet. Here is an example of provider implementation:

[codesyntax lang="java"]
public class SecuritiesBySymbolsProvider extends BaseProvider {
    private static Metadata parameterMeta = Metadata.from(SharedMeta.symbol);
    private static Metadata optionalMeta = Metadata.from();
    private static SecurityMarshaller marshaller = new SecurityMarshaller();

    public SecuritiesBySymbolsProvider() {
        super("SecuritiesBySymbolsProvider", parameterMeta, optionalMeta,
                marshaller.getMetadata());
    }

    @Override
    public void produce(DataRowSet parameter, DataRowSet response,
            QueryConfiguration config) throws DataProviderException {
        final String id = parameter.getValueAsText(SharedMeta.symbol, 0);
        Map<String, Object> criteria = new HashMap<>();
        criteria.put("symbol", id.toUpperCase());
        Collection<Security> securities = DaoLocator.securityDao.query(criteria);
        DataRowSet rowset = marshaller.marshal(securities);
        addRowSet(response, rowset, 0);
    }
}
[/codesyntax]

Typically, you will create data-provider for each different kind of query that you want to support. Each data provider specifies set of required and optional data fields that can be used to generate output data fields.

Here is an example of marshalling data from Securty domain objects to DataRowSet:

[codesyntax lang="java"]
public DataRowSet marshal(Security security) {
    DataRowSet rowset = new DataRowSet(responseMeta);
    marshal(rowset, security, 0);
    return rowset;
}

public DataRowSet marshal(Collection<Security> securities) {
    DataRowSet rowset = new DataRowSet(responseMeta);
    for (Security security : securities) {
        marshal(rowset, security, rowset.size());
    }
    return rowset;
}
...
[/codesyntax]

PlexDataProviders provides DataProviderLocator interface for registering and looking up provider, e.g.

[codesyntax lang="java"]
public interface DataProviderLocator {
    void register(DataProvider provider);

    Collection<DataProvider> locate(Metadata requestFields, Metadata responseFields);
...
}
[/codesyntax]

PlexDataProviders comes with a small application that provides data services by implementing various data providers. It uses PlexService framework for defining the service, e.g.

[codesyntax lang="java"]
@WebService
@Path("/data")
public class DataServiceImpl implements DataService {
    private DataProviderLocator dataProviderLocator = new DataProviderLocatorImpl();
    private QueryEngine queryEngine = new QueryEngineImpl(dataProviderLocator);

    public DataServiceImpl() {
        dataProviderLocator.register(new AccountsByIdsProvider());
        dataProviderLocator.register(new AccountsByUseridProvider());
        dataProviderLocator.register(new CompaniesBySymbolsProvider());
        dataProviderLocator.register(new OrdersByAccountIdsProvider());
        dataProviderLocator.register(new PositionGroupsBySymbolsProvider());
        dataProviderLocator.register(new PositionsBySymbolsProvider());
        dataProviderLocator.register(new QuotesBySymbolsProvider());
        dataProviderLocator.register(new SecuritiesBySymbolsProvider());
        dataProviderLocator.register(new UsersByIdsProvider());
        dataProviderLocator.register(new WatchlistByUserProvider());
        dataProviderLocator.register(new SymbolsProvider());
        dataProviderLocator.register(new UsersProvider());
        dataProviderLocator.register(new SymbolSearchProvider());
    }

    @Override
    @GET
    public DataResponse query(Request webRequest) {
        final DataRequest dataRequest = DataRequest.from(webRequest .getProperties());
        return queryEngine.query(dataRequest);
    }
}
[/codesyntax]

As you can see the data service simply builds DataRequest with input data fields and sends back response back to clients.

Here is an example client that passes a search query data field and requests quote data fields with company details

public void testGetQuoteBySearch() throws Throwable {
    String jsonResp = TestWebUtils.httpGet("http://localhost:" + DEFAULT_PORT
                    + "/data?responseFields=exchange,symbol,quote.bidPrice,quote.askPrice,quote.sales,company.name&symbolQuery=AAPL");
    ...

Note that above request will use three data providers, first it uses SymbolSearchProvider provider to search for matching symbols with given query. It then uses the symbol data field to request company and quote data fields from QuotesBySymbolsProvider and CompaniesBySymbolsProvider. The PlexDataProviders framework will take care of all dependency management for providers.

Here is an example JSON response from the data service:

[codesyntax lang="javascript"]
{
    "queryResponse": {
        "fields": [
            [{
                "symbol": "AAPL_X"
            }, {
                "quote.sales": [
                    [{
                        "symbol": "AAPL_X"
                    }, {
                        "timeOfSale.volume": 56
                    }, {
                        "timeOfSale.exchange": "DOW"
                    }, {
                        "timeOfSale.date": 1455426008762
                    }, {
                        "timeOfSale.price": 69.49132317180353
                    }],
                    [{
                        "symbol": "AAPL_X"
                    }, {
                        "timeOfSale.volume": 54
                    }, {
                        "timeOfSale.exchange": "NYSE"
                    }, {
                        "timeOfSale.date": 1455426008762
                    }, {
                        "timeOfSale.price": 16.677774132458076
                    }],
                    [{
                        "symbol": "AAPL_X"
                    }, {
                        "timeOfSale.volume": 99
                    }, {
                        "timeOfSale.exchange": "NASDAQ"
                    }, {
                        "timeOfSale.date": 1455426008762
                    }, {
                        "timeOfSale.price": 42.17891320885568
                    }],
                    [{
                        "symbol": "AAPL_X"
                    }, {
                        "timeOfSale.volume": 49
                    }, {
                        "timeOfSale.exchange": "DOW"
                    }, {
                        "timeOfSale.date": 1455426008762
                    }, {
                        "timeOfSale.price": 69.61680149649729
                    }],
                    [{
                        "symbol": "AAPL_X"
                    }, {
                        "timeOfSale.volume": 69
                    }, {
                        "timeOfSale.exchange": "NYSE"
                    }, {
                        "timeOfSale.date": 1455426008762
                    }, {
                        "timeOfSale.price": 25.353316897552833
                    }]
                ]
            }, {
                "quote.askPrice": 54.99300665695502
            }, {
                "quote.bidPrice": 26.935682182171643
            }, {
                "exchange": "DOW"
            }, {
                "company.name": "AAPL - name"
            }],
            [{
                "symbol": "AAPL"
            }, {
                "exchange": "NASDAQ"
            }]
        ],
        "errorsByProviderName": {},
        "providers": ["QuotesBySymbolsProvider", "SymbolSearchProvider", "CompaniesBySymbolsProvider"]
    }
}
[/codesyntax] 

PlexDataProviders is available from github and is licensed under liberal MIT license. It also comes with a small sample application for demo purpose. Feel free to send me your suggestions.

 

December 23, 2007

Released ErlSDB 0.1

Filed under: Erlang,SimpleDB,Web Services — admin @ 7:09 pm

I started working on an Erlang library to access Amazon’s SimpleDB web service and I released an early version of the library this weekend. Here are some notes on its usage:
Installing

svn checkout http://erlsdb.googlecode.com/svn/trunk/ erlsdb-read-only

Building

make

Testing

edit Makefile and add access key and secret key, then type make test

Usage

Take a look at test/erlsdb_test.erl to learn usage, here is a sample code

Starting Server

erlsdb:start(type,
    	[#sdb_state{
		access_key = "YourAccessKey",
		secret_key = "YourSecretKey",
		domain = "YourDomain"
		}
	])

Creating Domain

    erlsdb:create_domain()

Note that the server will use the domain that was passed during initialization.

Listing all Domains

    {ok, List, _} = erlsdb:list_domains()

Deleting Domain

    erlsdb:delete_domain()

Adding an item

    Attributes = lists:sort([
	["StreetAddress", "705 5th Ave"],
        ["City", "Seattle"],
        ["State", "WA"],
        ["Zip", "98101"]
	]),
    erlsdb:put_attributes("TccAddr", Attributes)

Retrieving an item

    {ok, UnsortedAttrs} = erlsdb:get_attributes("TccAddr")

Deleting an item

    erlsdb:delete_attributes("TccAddr"),

Powered by WordPress