design – Building a Microservices App — Can you give feedback on architecture?

I did some googling, and I was directed to Software Engineering to ask architecture questions. If you know of a different forum that could help me, please direct me to it.

I recently started learning about microservices, and would like to build an experimental app (the backend) just for practice. I’ll explain the app requirements, and after that outline my microservices-based solutions (and some doubts/questions I have). I’d love to get your feedback, or your approach to building this app using microservices.

Please note: I am a beginner when it comes to microservices, and still learning. My solution might not be good, so I’d like to learn from you.

The App (Silly App):

The purpose of this app is to make sure users eat carrots four times a week. App admins create a carrot eating competition that starts on day x and ends 8 weeks after day x. Users can choose whether or not to participate in the competition. When a user joins the competition, they need to post a picture of themselves eating a carrot. The admin approves/rejects the picture. If approved, the carrot eating session counts towards the weekly goal; otherwise it does not. At the end of each week, participating users are billed $10 for each carrot eating session they missed (for example, if they only eat carrots two times that week, they’re billed $20). That $20 goes into a “money bucket”. At the end of two months, users who successfully ate carrots four times a week every single week divide the money in the bucket among themselves. For example, assume we have users A, B, C. User A missed all carrot eating sessions for two months (putting $40 a week into the money bucket, so $320 by the end of two months). Users B and C eat their carrots four times a week consistently for two months, so users B and C each take home $320/2 = $160.
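
To make the payout arithmetic concrete, here is a minimal sketch (TypeScript, with hypothetical names, assuming the rules as stated: 8 weeks, 4 sessions per week, $10 per missed session) of how the bucket and the split could be computed:

// Hypothetical sketch of the payout rules described above.
const WEEKS = 8;
const REQUIRED_PER_WEEK = 4;
const FINE_PER_MISS = 10;

// resultsByUser: userId -> array of weekly approved-session counts (length 8)
function settleCompetition(resultsByUser: Map<string, number[]>): Map<string, number> {
  let bucket = 0;
  const winners: string[] = [];

  for (const [userId, weeks] of resultsByUser) {
    let missedAnyWeek = false;
    for (const sessions of weeks) {
      const missed = Math.max(REQUIRED_PER_WEEK - sessions, 0);
      bucket += missed * FINE_PER_MISS;
      if (missed > 0) missedAnyWeek = true;
    }
    if (!missedAnyWeek) winners.push(userId);
  }

  // Users with a perfect record split the bucket evenly.
  const payout = winners.length > 0 ? bucket / winners.length : 0;
  return new Map(winners.map((id): [string, number] => [id, payout]));
}

// The example from the text: A misses everything, B and C are perfect.
const payouts = settleCompetition(new Map<string, number[]>([
  ['A', Array(WEEKS).fill(0)],
  ['B', Array(WEEKS).fill(4)],
  ['C', Array(WEEKS).fill(4)],
]));
// payouts: B -> 160, C -> 160 (A contributed $320 to the bucket)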

Simplification:
I wanted to start simple. Forget about money. Forget about admin approval. We can add that later. For now, let’s focus on a very simplified version of the app.

  • Users can sign up, log in, and log out of the app
  • When a user signs up, they are automatically enrolled into the next carrot eating competition
  • Users can post an image of themselves eating a carrot
  • Users can see a feed of other users’ images (similar to Instagram, except all pics are of people eating carrots)
  • Users can access their profile: a page that displays how they’re doing in the competition, i.e., for each week, how many carrots they ate and which weeks they failed
  • At any point in time, users can access a page that shows who the current winners are (i.e., users who have not missed a carrot eating session yet)

Is this an appropriate simplification to start with?

Thinking Microservices – Asynchronous Approach:

Auth Service: Responsible for Authenticating User

Database:

  • User Table: id, username, email, password

Routes:

  • POST /users/new : signup
  • POST /users/login: login
  • POST /users/signout: signout

Events:

  • Publishes:
    • User:created (userId, username)

Image Service: Responsible for Saving Images (upload to Amazon S3)

Database:

  • User Table: userId, username
  • Image Table: imageId, userId, dateUploaded, imageUrl

Routes:

  • POST /users/:userId/images: Post new image
  • GET /users/:userId/images/:imageId: Return a specific image
  • GET /images: Return all images (Feed)

Events:

  • Publishes:
    • Image:created (userId, imageId, imageUrl, dateUploaded)
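
As a rough illustration of the asynchronous part, the Image Service could publish Image:created right after the upload succeeds. Below is a minimal TypeScript/Express-style sketch; uploadToS3, imageRepository, and broker are hypothetical placeholders for your S3 client, data access layer, and message-broker client (NATS, RabbitMQ, etc.):

import express from 'express';

// Hypothetical placeholders: swap in your real S3 upload, database layer,
// and message-broker client.
declare function uploadToS3(file: unknown): Promise<string>;
declare const imageRepository: {
  save(img: { userId: string; imageUrl: string; dateUploaded: Date }): Promise<{ imageId: string; imageUrl: string; dateUploaded: Date }>;
};
declare const broker: { publish(subject: string, data: unknown): Promise<void> };

const app = express();

app.post('/users/:userId/images', async (req, res) => {
  const { userId } = req.params;

  // 1. Store the image itself in S3 (here the raw body stands in for the
  //    uploaded file; in practice you would use something like multer).
  const imageUrl = await uploadToS3(req.body);

  // 2. Persist image metadata in the Image Service's own database.
  const image = await imageRepository.save({ userId, imageUrl, dateUploaded: new Date() });

  // 3. Publish the event so other services (e.g. the Competition Service) can react.
  await broker.publish('Image:created', {
    userId,
    imageId: image.imageId,
    imageUrl: image.imageUrl,
    dateUploaded: image.dateUploaded,
  });

  res.status(201).send(image);
});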

Competition Service: Responsible for managing competition

Database:

  • Competition table: id, startDate, duration
  • User table: id, username, competitionId, results (see below)

Routes:

  • POST /competition: create a competition
  • GET /competition/:competitionId/users/:userId: get results for a specific user
  • GET /competition/:competitionId/users: get a list of users participating in competition (see below)
  • GET /competition/:competitionId: get a list of winners, and for each loser how many carrot eating sessions they missed

Events:

  • Listens:
    • User:created
    • Image:created

In the user table, results is the JSON equivalent of:

results = {
   week1: {
       date: 'oct 20 2020 - oct 27 2020',
       results: ['mon oct 20 2020', 'tue oct 21 2020', 'thur oct 23 2020'],
   },
   week2: {
       date: 'oct 28 2020 - nov 4 2020',
       results: ['somedate', 'somedate', 'somedate', 'somedate'],
   },
   week3: {
       date: 'nov 5 2020 - nov 12 2020',
       results: [],
   },
   ...
}

Better ideas on how to store this data are appreciated.

GET /competition/:competitionId returns

const results = {
 winners: [{ userId: 'jkjl', username: 'jkljkl' }, { userId: 'jkjl', username: 'jkljkl' }],
 losers: [
   { userId: 'kffjl', username: 'klj', carrotDaysMissed: 3 },
   { userId: 'kl', username: 'kdddfj', carrotDaysMissed: 2 }
 ],
};
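
To show how the Competition Service’s “Listens: Image:created” part could work in this simplified version (no admin approval yet), here is a hedged TypeScript sketch; competitionRepo and its methods are hypothetical:

// Hypothetical Image:created handler inside the Competition Service.
interface ImageCreatedEvent {
  userId: string;
  imageId: string;
  imageUrl: string;
  dateUploaded: string; // ISO date string
}

declare const competitionRepo: {
  findCompetitionStartFor(userId: string): Promise<Date>;
  addCarrotSession(userId: string, week: number, date: string): Promise<void>;
};

async function onImageCreated(event: ImageCreatedEvent): Promise<void> {
  const start = await competitionRepo.findCompetitionStartFor(event.userId);
  const uploaded = new Date(event.dateUploaded);

  // Work out which week of the 8-week competition this session falls in.
  const msPerWeek = 7 * 24 * 60 * 60 * 1000;
  const week = Math.floor((uploaded.getTime() - start.getTime()) / msPerWeek) + 1;

  if (week >= 1 && week <= 8) {
    await competitionRepo.addCarrotSession(event.userId, week, event.dateUploaded);
  }
}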

What do you think of this? How would you improve it? Or would you approach this from an entirely different way?

c# – EF Migrations on Microservices architecture

I’m starting to get into microservices architecture, and I want to know where the right place for EF migration files is.

I have seen the EF migration files inside the API service, but I really do not know where I should put them.
What are the best practices for this?

microservices – Where to place an in-memory cache to handle repetitive bursts of database queries from several downstream sources, all within a few milliseconds span

I’m working on a Java service that runs on Google Cloud Platform and uses a MySQL database via Cloud SQL. The database stores simple relationships between users, the accounts they belong to, and groupings of accounts. Being an “accounts” service, it naturally has many downstreams. Downstream service A may, for example, hit several other upstream services B, C, D, which in turn might call other services E and F, but because so much is tied to accounts (checking permissions, getting user preferences, sending emails), every service from A to F ends up hitting my service with identical, repetitive calls. In other words, a single call to some endpoint might result in 10 queries to get a user’s accounts, even though that information obviously doesn’t change over a few milliseconds.

So where is it appropriate to place a cache?

  1. Should downstream service owners be responsible for implementing a cache? I don’t think so, because why should they know about my service’s data, like what can be cached and for how long.

  2. Should I put an in-memory cache in my service, like Guava’s CacheLoader, in front of my DAO (see the sketch after this list)? But does this really provide anything over MySQL’s caching? (Admittedly I don’t know anything about how databases cache, but I’m sure that they do.)

  3. Should I put an in-memory cache in the Java client? We use gRPC so we have generated clients that all those services A, B, C, D, E, F use already. Putting a cache in the client means they can skip making outgoing calls but only if the service has made this call before and the data can have a long-enough TTL to be useful, e.g. an account’s group is permanent. So, yea, that’s not helping at all with the “bursts,” not to mention the caches living in different zone instances. (I haven’t customized a generated gRPC client yet, but I assume there’s a way.)
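
For reference, here is a minimal sketch of option 2: a small in-process cache with a short TTL sitting in front of the DAO, so identical lookups arriving within a few milliseconds share one database query. It is shown in TypeScript purely to illustrate the idea; in a Java service this role would typically be played by something like Guava’s LoadingCache with a short expireAfterWrite. All names here are hypothetical, and the cache is deliberately simplified (no eviction of stale keys):

// Minimal TTL memo-cache in front of a DAO call.
interface CacheEntry<T> { value: Promise<T>; expiresAt: number }

class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  constructor(private ttlMs: number) {}

  get(key: string, loader: () => Promise<T>): Promise<T> {
    const now = Date.now();
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > now) return hit.value;

    // Cache the promise itself so concurrent identical requests share one DB query.
    const value = loader();
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }
}

// Usage in front of a hypothetical DAO:
declare const accountsDao: { getAccountsForUser(userId: string): Promise<string[]> };
const accountsCache = new TtlCache<string[]>(5_000); // 5-second TTL, tune as needed

function getAccountsForUser(userId: string): Promise<string[]> {
  return accountsCache.get(userId, () => accountsDao.getAccountsForUser(userId));
}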

I’m leaning toward #2 but my understanding of databases is weak, and I don’t know how to collect the data I need to justify the effort. I feel like what I need to know is: How often do “bursts” of identical queries occur, how are these bursts processed by MySQL (esp. given caching), and what’s the bottom-line effect on downstream performance as a result, if any at all?

I feel experience may answer this question better than finding those metrics myself.

Asking myself, “Why do I want to do this, given no evidence of any bottleneck?” Well, (1) it just seems wrong that there are so many duplicate queries, (2) it adds a lot of noise to our logs, and (3) I don’t want to wait until we scale to find out that it’s a deep issue.

microservices – Service integration with large amounts of data

I am trying to assess the viability of microservices/DDD for an application I am writing, in which a particular context/service needs to respond to an action completing in another context. Whilst I would previously have handled this via integration events published to a message queue, I haven’t had to deal with events which could contain large amounts of data.

As a generic example. Let’s say we have an Orders and Invoicing context. When an order is placed, an invoice needs to be generated and sent out.

With those bits of information, I would raise an OrderPlaced event with the order information in it, for example:

public class OrderPlacedEvent
{
    public Guid Id { get; }
    public List<OrderItem> Items { get; }
    public DateTime PlacedOn { get; }
}

from the Orders context, and the Invoicing context would consume this event to generate the required invoice. This seems fairly standard, but all the examples I have found are fairly small and don’t seem to address what would happen if the order has 1000+ items, which leads me to believe that maybe integration events are only intended for small pieces of information.

The ‘easiest’ way would be to just use an order ID and query the Orders service to get the rest of the information, but this would add coupling between the two services, which the approach is trying to remove.

Is my assumption that event data should be minimal correct? If it is, how would I correctly handle (or is it even possible to handle) a scenario where there are large pieces of data which another context/service needs to respond to?

Multitenancy – Get tenant in microservices architecture

In a multi-tenant architecture where each tenant has its own database, what would be the best way for each of the microservices to obtain information about the tenant (such as which database to connect to)?

We are still in the modeling phase of the application. We are having difficulty coming up with the best design. We are currently thinking about doing this:

  1. The user logs in to the application. Our authentication server
    (IdentityServer 4) returns the user’s information along with the
    tenant_id they belong to (this information is in a claim).

  2. The application directs the user to a specific tenant subdomain.

  3. The frontend communicates with an API Gateway, which identifies the
    tenant through the claim of the logged-in user.

  4. The API Gateway calls each of the microservices, passing the
    tenant_id as a parameter in the endpoint.

  5. The microservice receives tenant_id as a parameter and queries Redis
    to obtain the database connection string, as sketched below.
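
A minimal sketch of step 5, shown in TypeScript with the ioredis client purely to illustrate the idea; the key scheme and names are hypothetical:

import Redis from 'ioredis';

const redis = new Redis(); // connection details omitted

// Resolve the tenant's database connection string from the tenant_id
// that the API Gateway passed along.
async function getTenantConnectionString(tenantId: string): Promise<string> {
  const key = `tenant:${tenantId}:connection-string`; // hypothetical key scheme
  const connectionString = await redis.get(key);
  if (!connectionString) {
    throw new Error(`Unknown tenant: ${tenantId}`);
  }
  return connectionString;
}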

We don’t know if this is the best solution. It seems like a lot of work to have to pass tenant_id as a parameter in every endpoint of every microservice. We appreciate all the help and suggestions.

python – Docker Swarm Constraint Service (Microservices)

I’ve written my first “real” Python application. I’ve never worked much with Python before, so I’d like to receive feedback on how I can structure the application to make it follow the way Python programs are usually structured.

This is very much from a “readability” perspective, but I would also like other people’s opinions on the general structure, as well as my use of classes, naming of methods, and use of comments.

https://github.com/sbrattla/swarmconstraint

The use case for this application is real. I manage a Docker Swarm, but I need a service to run on only a couple of the nodes participating in that swarm. However, if I use placement constraints (constraining a service to specific nodes) and the nodes which the service is constrained to go down, then the service goes down as well. So, the application will remove the placement constraints if the specified nodes go down, so that the service can “fall back” to other nodes.

#!/usr/bin/python3

import argparse
import docker
import json
import logging
import re
import string
import time

class SwarmConstraint:

  def __init__(self, args):
    self.args = args
    self.initClient()

    self.logger = logging.getLogger(__name__)
    handler = logging.StreamHandler()
    formatter = logging.Formatter(
        '%(asctime)-25s  %(levelname)-8s  %(message)s')
    handler.setFormatter(formatter)
    self.logger.addHandler(handler)
    self.logger.setLevel(logging.DEBUG)

    if (not self.args['watch']):
      raise Exception('At least one node to watch must be provided.')

    if (not self.args['toggle']):
      raise Exception('At least one node to toggle must be provided.')

    if (not self.args['label']):
      raise Exception('At least one label must be provided.')

    if (not self.args['prefix']):
      raise Exception('A prefix must be provided.')

    self.logger.info('Watch {watch}.'.format(watch=','.join(self.args['watch'])))
    self.logger.info('Toggle the label(s) {labels} on {toggle}.'.format(labels=','.join(self.args['label']), toggle=','.join(self.args['toggle'])))
    self.logger.info('Prefix disabled labels with {prefix}.'.format(prefix=self.args['prefix']))

  def run(self):

    # Collect availability for watched nodes, and keep track of the collective
    # availability for all the watched nodes.
    nodes = self.getNodes()
    allWatchedNodesUnavailable = True
    for nodeId in nodes:
      watchNode = nodes[nodeId]
      if (not self.args['watch'] or watchNode['hostname'] not in self.args['watch']):
        continue

      if (self.isNodeAvailable(watchNode) == True):
        allWatchedNodesUnavailable = False
        break

    if (allWatchedNodesUnavailable):
      self.logger.warn('All watched nodes are unavailable.')
    else:
      self.logger.debug('One or more watched nodes are available.')

    # Disable or enable labels depending on the collective availability for all
    # the watched nodes. 
    for nodeId in nodes:
      toggleNode = nodes[nodeId]
      if (self.args['toggle'] and toggleNode['hostname'] not in self.args['toggle']):
        continue

      if (allWatchedNodesUnavailable):
        self.disableLabels(toggleNode, self.args['label'], self.args['prefix'])
      else:
        self.enableLabels(toggleNode, self.args['label'], self.args['prefix'])

  def getSocket(self):
    return 'unix://var/run/docker.sock'

  def initClient(self):
    # Initialize the docker client.
    socket = self.getSocket()
    self.client = docker.DockerClient(base_url=socket)

  def getNodes(self):
    # Returns all nodes.
    allNodes = self.client.nodes.list()
    allNodesMap = {}
    for node in allNodes:
      allNodesMap[node.id] = {
        'id' : node.id,
        'available' : True if node.attrs['Spec']['Availability'] == 'active' else False,
        'hostname': node.attrs['Description']['Hostname'],
        'role' : node.attrs['Spec']['Role'],
        'platform' : {
          'os' : node.attrs['Description']['Platform']['OS'],
          'arch' : node.attrs['Description']['Platform']['Architecture']
        },
        'labels' : node.attrs['Spec']['Labels'],
      }

    return allNodesMap

  def isNodeAvailable(self, node):
    return node['available']

  def disableLabels(self, node, labels, prefix):
    # Disable labels on a node by adding a prefix to each label. The node will only be
    # updated if at least one of the provided labels are currently enabled.
    matchingNode = next(iter(self.client.nodes.list(filters={'id':node['id']})), None)
    if (matchingNode is None):
      return

    spec = matchingNode.attrs['Spec']
    update = False

    for label in labels:
      if (label not in spec['Labels']):
        continue

      nodeLabelKey = label
      nodeLabelVal = spec['Labels'][nodeLabelKey]
      spec['Labels'].update(self.prefixNodeLabel(nodeLabelKey, nodeLabelVal, prefix))
      spec['Labels'].pop(nodeLabelKey, None)
      update = True

      self.logger.info('Disabling the label "{key}={val}" on {node}.'.format(key=nodeLabelKey, val=nodeLabelVal, node=node['id']))

    if (update):
      matchingNode.update(spec)
      return True
    else:
      return False

  def enableLabels(self, node, labels, prefix):
    # Enable labels on a node by removing the prefix from each label. The node will only be 
    # updated if at least one of the provided labels are currently disabled.
    matchingNode = next(iter(self.client.nodes.list(filters={'id':node['id']})), None)
    if (matchingNode is None):
      return

    spec = matchingNode.attrs['Spec']
    update = False

    for label in labels:
      label = self.prefixLabel(label, prefix)
      if (label not in spec['Labels']):
        continue

      nodeLabelKey = label
      nodeLabelVal = spec['Labels'][nodeLabelKey]
      spec['Labels'].update(self.unPrefixNodeLabel(nodeLabelKey, nodeLabelVal, prefix))
      spec['Labels'].pop(nodeLabelKey, None)
      update = True

      self.logger.info('Enabling the label "{key}={val}" on {node}.'.format(key=nodeLabelKey, val=nodeLabelVal, node=node['id']))

    if (update):
      matchingNode.update(spec)
      return True
    else:
      return False

  def prefixLabel(self, label, prefix):
    # Prefix a label key.
    return '{prefix}.{key}'.format(prefix=prefix, key=label)

  def isNodeLabelPrefixed(self, key, prefix):
    # Evaluates if a node label key is prefixed.
    return key.find(prefix) > -1

  def prefixNodeLabel(self, key, val, prefix):
    # Prefix a node label.
    label = {'{prefix}.{key}'.format(prefix=prefix,key=key) : '{val}'.format(val=val)}
    return label

  def unPrefixNodeLabel(self, key, val, prefix):
    # Remove prefix from a node label.
    key = key.replace('{prefix}.'.format(prefix=prefix), '')
    label = {'{key}'.format(prefix=prefix,key=key) : '{val}'.format(val=val)}
    return label

class FromFileAction(argparse.Action):

  def __init__(self, option_strings, dest, nargs=None, **kwargs):
    super(FromFileAction, self).__init__(option_strings, dest, **kwargs)

  def __call__(self, parser, namespace, path, option_string=None):
    if (path):
      data = None
      with open(path) as f:
        data = json.load(f)

      if (data is None):
        return

      if ('watch' in data):
        namespace.watch += data['watch']

      if ('toggle' in data):
        namespace.toggle += data['toggle']

      if ('label' in data):
        namespace.label += data['label']

    return

def main():
  parser = argparse.ArgumentParser(description='Toggles one or more constraints depending on node availability')
  parser.add_argument('--watch', metavar='watch', action='append', default=[], help='A node whose availability is to be watched.')
  parser.add_argument('--toggle', metavar='toggle', action='append', default=[], help='A node for which constraints are to be toggled. Defaults to all nodes.')
  parser.add_argument('--label', metavar='label', action='append', default=[], help='A label which is to be toggled according to the availability of the watched nodes.')
  parser.add_argument('--prefix', metavar='prefix', default='disabled', help='The prefix to use for disabled labels. Defaults to "disabled".')
  parser.add_argument('fromFile', action=FromFileAction, help='A file which holds configurations.')

  args = vars(parser.parse_args())
  se = SwarmConstraint(args)

  while(True):
    try:
      se.run()
      time.sleep(10)
    except KeyboardInterrupt:
      break
    except Exception as err:
      print(err)
      break

if __name__ == '__main__':
  main()

architecture – How to abstract a payment service to make developing new payment microservices faster

I am part of a team that primarily works on payments integration. Creating a new microservice to handle a new payment type and integration takes us a lot of time and involves a lot of boilerplate and duplication, especially when it comes to creating cloud infrastructure.

We currently build out our services on the AWS platform. We use Terraform to codify our infrastructure, and we build all services using the microservices pattern. We currently do this by using AWS Route 53 latency routing to either a backing AWS API Gateway or a load balancer, which then routes requests in a round-robin fashion to Docker containers in AWS ECS or AWS Fargate clusters. The domain routing is made up of a global domain and regional domains mapped to the regional services (from the API Gateway or ALB inwards into the system, most of the infrastructure is regional). We use AWS WAF and Shield for security.

We have somehow agreed on a way to optimise the infrastructure part. What remains now are the functional parts that run within the containers. We build those in JavaScript/Node.js/Express and store data in AWS DynamoDB. Most of our code is functional JavaScript code where classes are used minimally or not at all.

I have recommended the use of OOP features like interfaces, abstract classes, and inheritance. As the concept of interfaces is not available in JavaScript, I can see how promising inheritance could be here if we create a base class that has unimplemented or default implementations of common payment service functionality as functions. All the concrete payment types (e.g. Direct Debit, Credit Card, PayPal, etc.) can inherit from this base class and override these common functions, while using mixins to extend a class’s functionality with functionality from other useful utility classes or objects that might be required in multiple places or by multiple payment methods (but not all of them).
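
To make the proposal concrete, here is a rough sketch (TypeScript for the annotations; the same shape works in plain JavaScript). The class names, the mixin, and the methods are all hypothetical; it only illustrates a base class with default/unimplemented behaviour, concrete payment types overriding it, and a mixin adding functionality that only some payment types need:

// Base class: shared flow plus default/unimplemented payment functionality.
class PaymentService {
  async process(amount: number): Promise<string> {
    this.validate(amount);
    return this.charge(amount);
  }

  protected validate(amount: number): void {
    if (amount <= 0) throw new Error('Amount must be positive');
  }

  // "Unimplemented" default: concrete payment types must override this.
  protected charge(amount: number): Promise<string> {
    throw new Error('charge() must be overridden by a concrete payment type');
  }
}

// Mixin: adds recurring-billing support only to the payment types that need it.
type Constructor<T = {}> = new (...args: any[]) => T;
function Recurring<TBase extends Constructor<PaymentService>>(Base: TBase) {
  return class extends Base {
    async schedule(amount: number, intervalDays: number): Promise<void> {
      // hypothetical scheduling logic
      console.log(`Scheduling ${amount} every ${intervalDays} days`);
    }
  };
}

class CreditCardPayment extends Recurring(PaymentService) {
  protected charge(amount: number): Promise<string> {
    return Promise.resolve(`credit-card-charge-${amount}`); // call the card gateway here
  }
}

class DirectDebitPayment extends PaymentService {
  protected charge(amount: number): Promise<string> {
    return Promise.resolve(`direct-debit-charge-${amount}`); // call the bank integration here
  }
}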

The issue is that a key member of the team believes in sticking to the functional approach as much as is possible.

Please share your thoughts, and propose a better approach if you can. Would it also be possible to use mixins alone and drop OOP inheritance altogether in this case?

Thank you very much for your responses in advance.

architecture – DDD in microservices – where to draw the line of responsibility of a microservice?

It is pretty common for one microservice to need data from another microservice to serve its consumers (in the form of API responses). Fetching data synchronously couples the two microservices tightly. Leaving it to the consumer to fetch data from another microservice can lead to chatty APIs and slow response times.

A few conventions to confirm your domain model and push it further to arrive at the solution you are looking for:

Bounded Contexts make good Microservice boundaries.

One Bounded Context (BC) can contain more than one microservice, but one microservice should never span across BCs. A domain concept makes sense only when considered within a BC. It may mean something else in a different BC.

Your boundaries (Product Management and the Recommendation Engine) seem to be correct, IMHO.

The concept of Product may mean one thing in the Product Management Microservice but can be subtly different in the Recommendation Microservice. The differences can be structural or behavioral.

Microservices share nothing.

All data and APIs related to the Microservice are enclosed within it. If another Microservice needs this data, it is exposed through well-defined services (RPC-based communication, for example) or APIs (REST-based, for example). Accessing another microservice’s data via its database is strictly forbidden.

Microservices are connected over a common message channel.

Data points that are related to multiple microservices are published on a common channel as events. Interested Microservices have subscribers that watch for the event, pick it up, and process it for internal use. In DDD parlance, these are Domain Events.

An Aggregate in one Microservice could be a Value Object in another.

Product Management BC is the owner of Product-related data. Other microservices may retain/cache portions of that data within their boundaries (like you are doing with Product IDs, in your case).

Read models can be used to serve APIs with different needs.

You can populate a read model with data prepped and ready to be served in API responses. In your example, you would have a row (or multiple rows) per user in the read model with ready-to-ship data in the Recommendation Microservice.

There can be more than one read model per data structure, as dictated/required by API responses.

It’s perfectly valid to construct and store data in different formats to cater to different APIs. You would use Domain Events with a pub-sub model to populate these read-only data structures in the background.
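
As an illustrative sketch of that last point (TypeScript, hypothetical names): the Recommendation microservice subscribes to product-related Domain Events and upserts the data it needs into its own read model, which its API then reads directly:

interface ProductUpdatedEvent {
  productId: string;
  name: string;
  price: number;
}

// Read model row stored locally in the Recommendation microservice's database.
interface ProductReadModelRow {
  productId: string;
  name: string;
  price: number;
  updatedAt: Date;
}

declare const readModelStore: {
  upsert(row: ProductReadModelRow): Promise<void>;
};

// Subscriber registered on the common message channel (pub-sub).
async function onProductUpdated(event: ProductUpdatedEvent): Promise<void> {
  await readModelStore.upsert({
    productId: event.productId,
    name: event.name,
    price: event.price,
    updatedAt: new Date(),
  });
}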

An API request should be handled in entirety by one single Microservice.

Unless you are using reactive architectures and can gather data from microservices in parallel, you are better off handling the request in its entirety within one single Microservice.


So there is a third option of storing a copy of Product data (only what you need) as part of the Recommendation Microservice and using it when constructing the response for Hot Products.

The Product data here is treated as a cache, populated in the background (typically by listening to events published from the Product Management Microservice), and should be reconstructable in its entirety. Most importantly, the Recommendation engine should treat this data as read-only and not add any additional metadata to it.

microservices – Passing an OAuth Token between services with Zero Trust and audience checks

Let’s say we’re using an OAuth / OpenID Connect (OIDC) flow (in a Zero-Trust situation) to secure two APIs: ServiceA and ServiceB. To implement some of its functionality, ServiceA depends on ServiceB. ServiceA is calling ServiceB on behalf of the end user.

How would we deal with tokens in this situation:

  • The end user does not need to know that ServiceA is using ServiceB (implementation hiding)
  • The end user gets a token from an IDP, that both Services trust
  • The Services are developed by two separate teams in a large corporate environment, with Zero Trust. That means that ServiceB doesn’t (completely) trust ServiceA.
  • The end user would authenticate at the IDP and pass the token to ServiceA.
  • ServiceA verifies the token with the IDP and the IDP checks the audience.

But now the hard part:

  • ServiceA wants to call ServiceB and let ServiceB know (Zero Trust) that it is doing so on behalf of the end user.
  • ServiceA cannot use the token it got from the end user, because that has ServiceA as an audience. ServiceB will not be able to verify that token.
  • We could use the same audience for both services (since we’re all one company). However, in a typical corporate environment you could have hundreds of services and you don’t want to put them all in the same audience.
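
To illustrate why the end user’s token cannot simply be forwarded, here is a small sketch of the audience check ServiceB would perform (TypeScript with the jsonwebtoken library; the key handling, audience value, and issuer are simplified/hypothetical):

import { verify, JwtPayload } from 'jsonwebtoken';

const IDP_PUBLIC_KEY = process.env.IDP_PUBLIC_KEY ?? '';

// ServiceB only accepts tokens that name it as the audience, so a token
// issued with aud=service-a fails here even though the same IDP signed it.
function verifyForServiceB(token: string): string | JwtPayload {
  return verify(token, IDP_PUBLIC_KEY, {
    audience: 'service-b',              // hypothetical audience value
    issuer: 'https://idp.example.com',  // hypothetical issuer
  });
}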

A similar question has been asked on https://stackoverflow.com/questions/39839881/is-it-ok-to-pass-on-oauth-access-token-between-services, but that ignores the audience in the token.

architecture – Handling database information for tenants in microservices

Quick background: a multi-tenant SaaS with a database per tenant. The general idea is that a user will log in (authentication with Auth0), and the token passed from Auth0 to our microservices will contain a claim with the user’s tenantId. Based on this tenantId, we query a database to get the tenant’s database information.

My current idea is that when a microservice is called, it calls an Account microservice, passing the user’s tenantId, to receive the database information. The Account service itself uses an SQL database and Redis for caching.

  1. Is this the “correct” way to handle different connections for users (if not, what would be “better”)?
  2. How should the messaging between services for retrieving this database information work? Should the other services also try to read from the Redis cache before calling the Account service, if the information does not exist in the cache (this would require rewriting very similar code in almost every microservice)? Or should the services use a message broker (RabbitMQ, for example) to ask the Account service for the information every time?