mindlets

Sunday, November 01, 2009

supporting multitenancy in the application framework

Multitenancy is that characteristic of an application that allows one deployment of it to be used by multiple organizations. Wikipedia provides a nice, high-level introduction.

The historical approach is to build multitenancy support into the application. This means every resource access by a user needs to take into account which tenant the user belongs to. This can be very complex, especially when the feature needs to be retrofitted into an existing code base.

A second approach is to use virtualization. Each tenant has its own operating system instance. This has the very significant benefit of not introducing complexity at the application level. But the complexity moves to the management of virtual images, especially if the application is distributed (tiered, or scaled), or if the number of tenants is high, and changes often. Also, in an environment with a large number of tenants, the cost of one VM per tenant (scaled) may be prohibitive.

A possible third approach - to be investigated here - is to build multitenancy support into the application framework. This frees application developers from having to worry about it, and does not have the cost and complexity associated with managing a large number of virtual instances. But, complexity is obviously added to the application framework, and applications deployed on the framework are constrained in the way they access resources; they HAVE to use the framework's APIs, and of course the framework APIs have to exist and be multitenant aware for every type of resource consumed by the app.

The following is nothing more than a brainstorm, in hopes of getting a clearer picture of what approach 3 means, in the case of a JEE web app. It contains no solutions. Analysis of each potential pitfall is superficial. The list of potential pitfalls is not exhaustive.

Request handling and tenant context

The tenant needs to be identified by the time the framework hands off the request to the application. This tenant context is then needed whenever a shared resource is accessed, so the resource provider knows which tenant is requesting access.

Ideally, the application code is not aware of the tenant context. At the same time, it must be available to all tenant-aware resource providers.

Thread-local storage works well enough in the synchronous model. When the application is asynchronous, a special executor service implementation must be used to ensure that the tenant context is stored along with the task, and that it is placed in the thread-local storage of the thread executing the task. A bit messy...
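To make the executor idea a bit more concrete, here is a minimal sketch of what such a wrapper could look like. TenantContext and TenantAwareSubmitter are hypothetical names made up for this post, not part of any existing framework, and a real implementation would probably wrap the whole ExecutorService interface rather than a single submit method.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical holder for the tenant id bound to the current thread. */
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TenantContext() {}

    static String get() { return CURRENT.get(); }
    static void set(String tenantId) { CURRENT.set(tenantId); }
    static void clear() { CURRENT.remove(); }
}

/** Hypothetical wrapper: tasks carry the submitter's tenant id to the worker thread. */
final class TenantAwareSubmitter {
    private final ExecutorService delegate;

    TenantAwareSubmitter(ExecutorService delegate) {
        this.delegate = delegate;
    }

    <T> Future<T> submit(Callable<T> task) {
        // Captured on the submitting (request) thread, stored along with the task.
        final String tenantId = TenantContext.get();
        return delegate.submit(() -> {
            String previous = TenantContext.get();
            TenantContext.set(tenantId);
            try {
                return task.call();
            } finally {
                // Restore the worker's previous state so a pooled thread
                // does not leak one tenant's context into the next task.
                if (previous == null) {
                    TenantContext.clear();
                } else {
                    TenantContext.set(previous);
                }
            }
        });
    }
}
```

The application code inside the task never sees the tenant id. Only tenant-aware resource providers would call TenantContext.get().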

Database Access and schema

The framework must expose a multitenancy-aware JPA implementation. How this gets done depends on schema choices. Multiple approaches can be investigated here:

* One persistence unit per tenant. Each persistence unit targets a distinct database instance. This may result in a large number of databases, and JPA scalability in terms of number of persistence units would have to be verified. In this case the JPA wrapper only has to control creation of entity managers.

* One global database, tenant-specific tables. The JPA wrapper intercepts query and persist operations, and qualifies the table name with the tenant. Database scalability in terms of number of tables needs to be verified. A mix of this and the above approach may also be considered.

* Global tables with a hidden tenant key. In this case the JPA wrapper intercepts queries and adds the tenant column (in WHERE conditions and INSERT statements). This seems like the most complex in terms of the framework, and introduces additional storage (the tenant column). This is the approach chosen by Grails.

Note: The request context (containing the tenant information) needs to be available to the JPA wrapper implementation when it creates or retrieves an entity manager. This information needs to be passed out of band (e.g. thread-local), which is an architectural constraint on the application.
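For the first approach (one persistence unit per tenant), the wrapper mostly boils down to a lazily populated per-tenant lookup. The sketch below is only an illustration under that assumption. PerTenantRegistry is a hypothetical name, the Supplier stands in for however the framework exposes the tenant context out of band, and the factory function stands in for whatever creates the per-tenant resource (in a JPA setup, presumably something that builds an entity manager factory for that tenant's persistence unit).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical registry handing out one resource per tenant, created on first use. */
final class PerTenantRegistry<R> {
    private final Map<String, R> byTenant = new ConcurrentHashMap<>();
    private final Supplier<String> currentTenant;
    private final Function<String, R> factory;

    PerTenantRegistry(Supplier<String> currentTenant, Function<String, R> factory) {
        this.currentTenant = currentTenant;
        this.factory = factory;
    }

    /** Returns the resource of the tenant bound to the current request. */
    R current() {
        String tenantId = currentTenant.get();
        if (tenantId == null) {
            throw new IllegalStateException("no tenant bound to this request");
        }
        // computeIfAbsent creates the resource at most once per tenant.
        return byTenant.computeIfAbsent(tenantId, factory);
    }
}
```

The application only ever asks for "the current" resource. Which tenant that means is decided by the framework, not by application code.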

System APIs

File system access, network IO, ...

Overriding file system features of java.io and java.nio packages, or preventing their direct use, is required to support multitenant filesystem access, as different tenants must transparently access different areas of the file system.
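As an illustration of what an alternate file API might do under the hood, here is a minimal sketch that maps a tenant-relative path to a per-tenant directory and refuses paths that escape it. TenantFileSystem, the directory layout and the tenant id format are all assumptions made for the example. It does not deal with symbolic links.

```java
import java.nio.file.Path;

/** Hypothetical helper confining each tenant to its own directory under a base directory. */
final class TenantFileSystem {
    private final Path baseDir;

    TenantFileSystem(Path baseDir) {
        this.baseDir = baseDir.toAbsolutePath().normalize();
    }

    /** Resolves a tenant-relative path, rejecting anything outside the tenant's root. */
    Path resolve(String tenantId, String relativePath) {
        // Assumed id format; keeps ids like ".." from being used as directory names.
        if (tenantId == null || !tenantId.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("invalid tenant id");
        }
        Path tenantRoot = baseDir.resolve(tenantId);
        Path resolved = tenantRoot.resolve(relativePath).normalize();
        // Path.startsWith compares whole name elements, so "acme2" is not under "acme".
        if (!resolved.startsWith(tenantRoot)) {
            throw new SecurityException("path escapes tenant root: " + relativePath);
        }
        return resolved;
    }
}
```

Two tenants asking for "reports/q1.txt" end up in different areas of the file system without knowing about each other.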

Overriding network access features may be necessary in order to ensure fair access to network or thread resources.

Application configuration, monitoring

In the single tenant case, it is common practice to store application configuration in files accessible as resources. This does not work in a multitenant situation. Instead, applications need to consume a tenant-aware framework API.
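One possible shape for such an API, sketched under the assumption that tenants share application-wide defaults and override individual keys. TenantConfig and its methods are hypothetical names, and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical configuration store: shared defaults plus per-tenant overrides. */
final class TenantConfig {
    private final Map<String, String> defaults = new HashMap<>();
    private final Map<String, Map<String, String>> overrides = new HashMap<>();

    void setDefault(String key, String value) {
        defaults.put(key, value);
    }

    void setForTenant(String tenantId, String key, String value) {
        overrides.computeIfAbsent(tenantId, id -> new HashMap<>()).put(key, value);
    }

    /** Returns the tenant's own value if it has one, otherwise the shared default. */
    String get(String tenantId, String key) {
        Map<String, String> own = overrides.get(tenantId);
        if (own != null && own.containsKey(key)) {
            return own.get(key);
        }
        return defaults.get(key);
    }
}
```

In a real framework the tenant id would come from the request context rather than being passed in by the application.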

Creating and using a tenant-aware JMX provider may be worth investigating here.

Logging

The Java logging API supports the configuration of application-defined log appenders. In a multitenant framework, application developers may choose only from a list of tenant-aware appenders. The framework sets tenant-aware defaults.
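As a small illustration, here is a sketch of a tenant-aware formatter for java.util.logging that stamps every line with the tenant id. TenantPrefixFormatter is a hypothetical name, and the Supplier is an assumption standing in for the framework's tenant context. It only works if formatting happens on the thread that handles the request.

```java
import java.util.function.Supplier;
import java.util.logging.Formatter;
import java.util.logging.LogRecord;

/** Hypothetical formatter that prefixes each log line with the current tenant id. */
final class TenantPrefixFormatter extends Formatter {
    private final Supplier<String> currentTenant;

    TenantPrefixFormatter(Supplier<String> currentTenant) {
        this.currentTenant = currentTenant;
    }

    @Override
    public String format(LogRecord record) {
        String tenantId = currentTenant.get();
        String prefix = (tenantId == null) ? "no-tenant" : tenantId;
        // formatMessage applies any message parameters or resource bundle of the record.
        return "[" + prefix + "] " + record.getLevel() + " "
                + formatMessage(record) + System.lineSeparator();
    }
}
```

The framework would install this on its default handlers, so application code keeps logging the way it always did.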

External Service Access

Requests to third party services must appear as coming from a tenant. Authentication data, such as an OAuth access token cache, must be accessed in a tenant-aware fashion.

JMS

When sharing a JMS provider among tenants, the following issues should be considered:
* Namespace management. Two tenants should be able to use the same application-level topic/queue name.
* Fair resource utilization. One tenant's event activity should be constrained so as not to impact others.

The framework can inject a tenant-aware JMS client implementation, which could be a wrapper around an existing one. In order to ensure fair resource utilization, the JMS broker needs to be able to impose limits, and may need to be tenant-aware.
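The namespace part could be as simple as having the wrapper qualify destination names before handing them to the real client. A minimal sketch, where TenantDestinations and the "tenant.<id>.<name>" naming scheme are assumptions made for the example:

```java
/** Hypothetical helper mapping application-level destination names to per-tenant ones. */
final class TenantDestinations {
    private TenantDestinations() {}

    /** Qualifies a topic/queue name so that tenants never share a physical destination. */
    static String qualify(String tenantId, String name) {
        // Dots are rejected because they separate the parts of the assumed naming scheme.
        if (tenantId == null || tenantId.isEmpty() || tenantId.indexOf('.') >= 0) {
            throw new IllegalArgumentException("invalid tenant id");
        }
        return "tenant." + tenantId + "." + name;
    }
}
```

With this, two tenants can both use a queue called "orders" at the application level. This does nothing for fair resource utilization, which still has to be enforced by the broker.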

Libraries

Some library implementations may be incompatible with a multitenant environment. Things to watch out for include singleton configuration or data objects.

Bottom line (so far)

In most cases, Java allows applications to use their own implementation of a particular API. For these cases, it is possible for the framework to provide tenant-aware implementations.

In a JEE application, the container must be tenant aware. This requires modifications of the container: the servlet container itself in case the servlet API is used directly, or the higher-level framework on which the application is built.

Base Java APIs (IO, net) are not easily pluggable. This requires that the framework expose alternate APIs.

In practice, applications running on a multitenant framework will be restricted in what they can use (code, resources), as everything they use must be multitenant compatible. How restricted will depend on the context. Google App Engine provides an idea of what these restrictions are.

Things may be less restricted in a more controlled environment in which the tenants are applications created by a known group, e.g. services created by a company and deployed on a common framework.

Labels: j2ee, java, jee, multitenancy

Monday, February 23, 2009

Tuesday, November 25, 2008

SushiPoll

Sunday, September 14, 2008

facebook and death

Just checked my Facebook profile. One nice feature of Facebook is that it shows you pictures of 6 of your friends, picked randomly (or using some algorithm I have not figured out). I often use it to find out what so and so has been up to lately. All in all it's probably the feature I and many others use the most.

Today, my friend and former colleague Jonathan popped up. Jonathan tragically passed away earlier this summer, at the tender age of 39. Here he is, smiling with a beach in the background. After staring at it for a bit, I clicked and reread some of the nice things people had written on his wall. I also went through the pictures, remembered some shared moments, wondered how his family is dealing. I thought about writing on his wall, but did not really feel up to it. Maybe some day. In the end, I did not get less from visiting Jonathan's profile than I would from visiting anyone else's. Less news, but more remembrance and meditation perhaps.

This got me thinking. What happens to your profile when you die? In the case of Facebook, it looks like it's business as usual. Everything works. You can actually send friend requests too.

Facebook used to take deceased people's profiles offline and changed the policy by popular demand. That's good. Perhaps a clearer indication of status could be provided, and some features, like messaging or friending, turned off at the request of the family or will executors.

Also, what happens after years of inactivity from the account owner? Some sites freeze or terminate your account if you don't use the site for a long time. Does facebook have such a policy? Will it? If so, will my deceased friends disappear from my profile without asking?

At some point social networks are going to have to deal with this question. Whatever happens, I hope facebook does not take my deceased friends away without asking me.

Labels: death, facebook, social network

Friday, September 12, 2008

open hack 08

I was one of the lucky ones to participate in Yahoo!'s open hack day today. Yeah! I got a t-shirt, a thermos bottle, and a whole bunch of stickers. I got to learn some new stuff as well.

The goal of the 2-day event is mostly to introduce the latest API developments at Yahoo! to the web developer community. Among the represented technologies were flickr, fire eagle, Yahoo!'s mail, music, search and geo APIs, and more. I could not see everything, and focused mostly on learning about the Yahoo! Application Platform (YAP), which the Yahoo! folks let us preview.

The general idea is this: 1. Centralize all the profile and social graph information currently held in the various Yahoo! properties. 2. Provide APIs to access all that good stuff. 3. Align these APIs with the opensocial spec whenever applicable.

As an opensocial application developer, the prospect of the giant Yahoo becoming the next opensocial container on the block is fascinating. This can potentially make myspace - the current behemoth - look like an upstart.

Well, not so fast. There are a lot of tricky problems to solve, not the least of which is how you merge the relationships you have built on flickr, yahoo messenger, contacts, 360, etc... Still, this is very promising, so I decided to turn one of my zembly widgets into a yap opensocial application. Unfortunately, my initial excitement was quickly tempered by the following setbacks.

First, YAP does not take a gadget XML as input. Instead you have to give it a URL to the html page. No biggie. Just give it the widget iframe link.

Next obstacle: no external script is allowed. My widget used prototype.js methods, so I took out the prototype dependency and implemented replacements in my widget. Good thing I was not using much. I also had to inline in my widget the zembly gadget runtime (things-gadgets.js).

Next obstacle: YAP uses Caja... And Caja did not like many things about my widget. If you claim that your widget is xhtml, as in

<html xmlns="http://www.w3.org/1999/xhtml">

it will parse it as XML. It tripped on the first less-than comparison in the script, thinking it was the start of a new element.

if (cur < startIndex) {

The trick here was to convert the statement into a greater-than comparison.

if (startIndex > cur) {

Then Caja fussed about a bunch of javascript syntax issues. One Caja rule I ran up against is the global scope leaking problem. Supporting Caja in zembly is going to require some work on the part of widget developers.

In the end, some of the caja errors were due to the zembly widget runtime, which I could not hack, so my first yap app never ran. Still, I was able to turn my own code into something caja compliant. In the process I found the caja test bed to be very useful.

Aside from hacking, another interesting talk was about an experiment by a Yahoo! team in building a complete app on myspace. Here is a summary of their findings:

Pleasure:
* Documentation (opensocial, myspace)
* Plain standard stuff (no FBML, FBJS, or other proprietary language)
* ability to quickly move to another container

Pain:
* unpredictable caching
* platform outages
* lack of communication regarding changes in platform
* inconsistent spec file support between containers
* UI load time. It takes time to go and fetch the data after load, and it does affect the user experience. The team even suggested server-side UI rendering backed by REST service calls, combined with doing a document.write() in the gadget UI code.

Labels: caja, openhack, opensocial, yahoo

Wednesday, August 16, 2006

cooking shows in the participation age

It has become exceedingly easy for anyone to post any pointless pile of bytes for any bored soul out there to download. That's what smart people call the participation age.

In any case, the other night, being one of these bored and sleepless souls, I wandered to youtube to see what was hot. This is how I found Mr Cook.

Mr Cook is a guy who makes videos of himself cooking in his kitchen and talking the whole time like the TV chefs. Like for most TV cooking shows, the culinary value of his recipes is at best questionable, however the delivery is - I am almost ashamed to admit it - simply impeccable. It is packed silly with the sort of teenage humour I somehow never grew out of. I never thought I could become a cooking show fanatic...

There are a dozen or so shows (at the time of this writing) you can get on youtube and learn how to make Nachos, Sushi, Pizza, and more.

Ok, enough talking. If you don't know Mr Cook, you can start out by watching Mr Cook make a baby, which I suppose is a famous Scottish recipe.

Warning: if you are an actual cooking show fanatic, you may be offended by this video.

Friday, August 04, 2006

Personal Pubsub

The Jabber community just came up with a very elegant way to give IM users their own multi-content news (aka publish/subscribe) service: JEP-0163. Traditional IM uses of a personal pubsub service include avatar distribution and music or mood sharing. Yet the possibilities are limitless: one could use this mechanism to share all kinds of domain-specific information. XMPP-enabled Call Center systems could allow agents to selectively share with others information about current calls, questions they are not able to answer, etc... XMPP-enabled blog sites can notify your buddies of your new posts.

This mechanism is also really good for bots. Without it, the only client-independent ways a bot has to communicate with you are the presence status (show and status) or vanilla messages (natural language in subject and body) to individual folks. You can still make extensions for controlled environments, but this does not solve the issue of overloading presence and having the bot keep track of who to send what to.

The protocol is based on the rich (but complex) XMPP publish/subscribe protocol, with some simplifications taking into account the fact that we are talking specifically about persons. But what is really interesting about the proposal is how it leverages presence subscriptions and entity capabilities to become really simple in the very common case where events are shared with your buddies. It's like this: By virtue of subscribing to someone's presence, and advertising the capability to handle a specific type of content, you are implicitly subscribed to content of this type published by this person. In other words, just by sharing one capability (one xml element in a previously shared presence stanza), you are automatically subscribed to all of your buddy's content corresponding to this capability. Can this get easier or more efficient? I doubt it. A very nice way to build on existing protocols.
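To spell out the implicit subscription rule, here is a small model of the decision a server would make when someone publishes an event. This is only a sketch of the logic, not an XMPP implementation: Contact and PersonalEventRouter are hypothetical names, and the rule that interest in a content type is advertised as the content's namespace followed by "+notify" is my reading of the spec.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical view of a roster entry, as seen by the publisher's server. */
final class Contact {
    final String jid;
    final boolean subscribedToMyPresence;
    final Set<String> advertisedFeatures;

    Contact(String jid, boolean subscribedToMyPresence, Set<String> advertisedFeatures) {
        this.jid = jid;
        this.subscribedToMyPresence = subscribedToMyPresence;
        this.advertisedFeatures = advertisedFeatures;
    }
}

final class PersonalEventRouter {
    private PersonalEventRouter() {}

    /** Returns the JIDs that should receive an event published for the given content type. */
    static List<String> recipients(String contentNamespace, List<Contact> roster) {
        // Interest in a content type is assumed to be advertised as "<namespace>+notify".
        String interest = contentNamespace + "+notify";
        List<String> result = new ArrayList<>();
        for (Contact contact : roster) {
            // Both conditions must hold: presence subscription and advertised capability.
            if (contact.subscribedToMyPresence && contact.advertisedFeatures.contains(interest)) {
                result.add(contact.jid);
            }
        }
        return result;
    }
}
```

No explicit subscribe request appears anywhere. A buddy who stops advertising the capability simply drops out of the recipient list.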