Thoughts on Open Source, Data Services, and Integration
Successful SaaS requires open source distribution economics…
Posted 18 July, 2008 by
Chris in
SaaS, SugarCRM
1 Comment
Sarah Lacy’s article in Business Week is a sobering reminder of the challenges software companies face building on-demand businesses. The point made vividly clear here:
“SAP thought customers would go to a Web site, configure it themselves, and found the first hundred or so implementations required a lot of time and a lot of tremendous costs,” Richardson says. “Small businesses are calling for support, calling SAP because they don’t have IT departments. SAP is spending a lot of resources to configure and troubleshoot the problem.”
Nick Carr chimes in here with the point:
Anyone who thinks the software-as-a-service business is a gold mine is wrong. The economics are fundamentally different from those of the traditional software business - and not in a good way
I spoke yesterday to a SaaS platform provider whose business is to help ISVs deliver their solutions as a service. They described some of the challenges their customers face: multi-tenancy, scalability, security, APIs, not to mention simply cramming the features into a browser.
I’ve watched other companies launch on-demand offerings with great fanfare, only to find later that the offering languished alongside their traditionally delivered alternatives. Then, after facing the brutal reality of the SaaS model, they purge all references to the SaaS offering (don’t they know there’s a Wayback Machine?).
As Sarah notes, there’s no putting the genie back in the bottle. So what’s a company to do?
One point that Sarah mentions but doesn’t fully explore is the role of open source in SaaS. There are many dimensions to this, including how SaaS companies have leveraged open source to deliver their services more quickly, and how some open source companies are trying to rein in the free riders with the AGPL.
However, I believe the real impact of open source on SaaS will come from its distribution model, not its development model. JBoss’s sales machine is well known among open source businesses. They were tremendously successful at monetizing their community, which dramatically reduced the costs of sales and marketing. In fact, most professional open source companies benefit more from efficient distribution than from efficient development.
So what does this mean for SaaS companies? The successful ones are going to leverage not only the development model of open source but, perhaps more importantly, the distribution model as well. There are already examples of this. SugarCRM is perhaps the most prominent open source company that has extended its model to include SaaS. That’s the fastest growing part of their business and already represents about 30% of their total subscribers.
I believe this is an exciting evolution for both SaaS and open source businesses.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=208
EuroPython 2008, Day 3
Posted 13 July, 2008 by
mikeyp in
conference, Python, community
No Comments
Day 3 of EuroPython continued with the same intensity as the previous two
days. The talks I attended on Wednesday were:
- Jussi Rasinmaki’s talk on use the batteries included
A great “Python in Action” advocacy talk on leveraging Python in a real
application for forest management.
- Beatrice During and Holger Krekel’s update on PyPy - Behind the scenes
This session was a summary of the current and planned PyPy activities and
roadmap. The part that really caught my attention (apart from the PyPy activity
itself) was the level of EU government sponsorship and funding of open source
development. We really don’t see that in the US. Or, maybe I just haven’t been
looking….
- Mike Cariaso’s talk on Python in the Amazon Cloud
Mike is using Amazon EC2 with Python to perform compute intensive DNA analysis.
He has developed runblast to make EC2 more accessible to ordinary humans.
The talk also went into some cool details about SNPedia and promethease.
- Andreas Schreiber’s talk on DataFinder
Andreas described the problems at DLR with managing large scientific datasets.
They looked at commercial data management systems, and found them to be
expensive and top heavy with useless features. Also, the tools use proprietary
or unusable scripting functionality. As a result, they decided to build their
own. They developed a prototype in Java, but had problems with platform support
(write once, debug everywhere?). However, the users liked the embedded Jython
capabilities in the original prototype and requested a Python solution. This
resulted in a final implementation using Python. Key reasons for using Python
at DLR were: easy to learn, rapid development, inherently maintainable.
The last block of sessions for the day was the only time in the conference
where I really had trouble making a decision between talks. I opted for Jack
Diederich’s Class Decorators: Radically Simple, which meant I missed Gasper
Zejn’s Managing Computing Clouds on Unreliable Nodes with Python.
As for class decorators, it’s just one more reason for me to start using
Python 2.6 / 3.0. Jack wrote the reference implementation for PEP 3129, and this
talk really did radically simplify the use cases for class decorators.
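For anyone who has not seen the syntax yet, here is a minimal sketch of a
class decorator as PEP 3129 defines it. The registry and the names (PLUGINS,
register, CsvReader) are made up for illustration and are not taken from
Jack's talk.

```python
# Hypothetical plugin registry, used only to illustrate the syntax.
PLUGINS = {}


def register(cls):
    """Class decorator: record the class under its name and return it unchanged."""
    PLUGINS[cls.__name__] = cls
    return cls


@register
class CsvReader:
    """Toy class; the decorator runs once, right after the class body."""

    def read(self, text):
        return [line.split(",") for line in text.splitlines()]


# The decorator line is equivalent to writing
#     CsvReader = register(CsvReader)
# after the class statement, which is what we had to do before 2.6.
```

The appeal is that the registration sits next to the class name instead of
trailing the class body, where it is easy to forget.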
The Lightning talks for Wednesday covered a lot of ground and, as usual, I
learned a few new tricks.
One library I never knew about was Stefan Schwarzer’s ftputil, which implements
a high level API for FTP. This is essentially a virtual filesystem which
implements os and os.path functionality.
There was also a lightning talk on using reStructuredText and docutils to
generate S5 presentations. I started using this tool chain a couple of months
ago, and I’m really beginning to like it. One text file can generate slides and
printed documentation which work in any web browser. It’s really useful for
notes, tutorials, and other basic presentations. This is not really the tool
for slick, animated, whiz bang stuff, but I rarely do that type of
presentation.
Inspired by Hans Rosling’s keynote, I also did a lightning talk on public data.
After three days, I have concluded that the EuroPython community is really not
all that different to the community in the US (despite comments I’ve heard to
the contrary.)
The EuroPython conference had fewer Django specific users than PyCon, but there
was definitely a surge in Django related attendance at PyCon 2008. That effect
might continue next year at EuroPython (which will be in Birmingham, UK). On
the other hand, there were way more Zope/Plone developers at EuroPython. It’s
not clear to me whether this is because Zope is more popular in Europe, or
because there’s more overlap of the communities in Europe. I suspect it’s a
combination of both.
EuroPython 2008 was definitely worth the trip.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=207
An ESB for the Web? Who Needs it?
Posted 11 July, 2008 by
Chris in
ESB, web services
1 Comment
Who needs it? Seriously.
I just read Jim Stogdill’s post on O’Reilly Radar that describes the new data portability hub Gnip as ‘An ESB for the Web’.
It’s something that occurred to me as well when I first heard about it, but I quickly realized that, while it might be useful for some Data Portability transformations and social networking site integrations, it’s a long way from being an ESB for the web, or even useful outside of the social networking domain.

For starters, their protocol bridge doesn’t even mention SOAP (I’m not a big fan of SOAP, but let’s be realistic. Every transactional application on the web today has a SOAP interface). From there, there are about 100 other things that make this seriously deficient as an ESB for the Web.
Which is perfectly fine for what they’re doing. They are working toward ‘Making Data Portability Suck Less’, which is something I really do think they can achieve with this bridge. But Jim, let’s not get carried away and ascribe to them capabilities (not to mention a strategy) that they don’t seem interested in.
One thing that Jim points out that is dead-on is that web people and enterprise people think differently. Which explains why there’s no SOAP on the Gnip bridge, and why enterprise people still think they even need an ESB for the web. I could go on and dig into all the details, but it’s presented more clearly here than I ever could.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=206
SnapLogic for EC2 now available…
Posted 8 July, 2008 by
Chris in
PaaS, Cloud
No Comments
I’m happy to announce that we just released an updated version of SnapLogic for EC2.
Just like that, we’re a Cloud Computing Company. Seems like everyone, everywhere is a Cloud Computing Company. Heck, even The Onion knows this.
But seriously, this is a natural step for us. From the ground up, SnapLogic was built with the web in mind, so making it available for use on Amazon’s Web Services (AWS) was a natural choice. AWS users are forward thinking and want to use the power of the web for everything they do. SnapLogic is attracting these kinds of users because everything they’re doing to build scalable, reliable, secure web applications applies equally to their integrations with SnapLogic. Our re-usable RESTful components, web-friendly protocols and browser-based IDE fit perfectly with their PaaS and SaaS strategies.
So just as they look to hosted alternatives for applications, they’re increasingly looking for hosted alternatives for their platforms and infrastructure. SnapLogic for AWS is a perfect fit.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=205
EuroPython 2008, Day 2
Posted 8 July, 2008 by
mikeyp in
conference, Python, community
No Comments
Day 2 of EuroPython 2008 has been just as good as day 1.
Today, I attended:
- Dinu Gherman’s talk on Visualising Relationships Between Python Objects.
This was a description of a prototype he’s working on for visualization of
Python object relationships. Based on GraphViz, it appears promising, and it
could have interesting applications for documenting and understanding code.
- Peter Bulychev’s talk on Duplicate code detection using his Clone Digger.
Another interesting talk on a system for detecting duplicate code in Python
programs.
- John Pinner’s talk on Conferencing systems in Python.
A good discussion on the support system for the EuroPython and PyCon UK
conferences. Conference organizing systems seem to be an endless source of
challenges for developers.
- Raymond Hettinger’s Descriptor tutorial
A very good talk on the internals of Python descriptors.
Now, all your dots are belong to us.
- Stefan Behnel’s introduction to the Cython compiler
Python to C, lots of uses for performance optimization and external library
interfacing.
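The "dots" remark about the descriptor tutorial refers to attribute access:
descriptors are the hook behind obj.attr. Here is a minimal sketch of the
protocol, written against current Python 3 (__set_name__ did not exist in
2008). The Positive and Order classes are my own illustration, not material
from Raymond's tutorial.

```python
class Positive:
    """Data descriptor that only accepts values greater than zero."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created; remember where to
        # store the per-instance value.
        self.private_name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class itself, e.g. Order.quantity.
            return self
        return getattr(obj, self.private_name)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError("value must be positive, got %r" % (value,))
        setattr(obj, self.private_name, value)


class Order:
    quantity = Positive()  # every read or write of .quantity goes through Positive

    def __init__(self, quantity):
        self.quantity = quantity
```

Assigning order.quantity = 0 raises ValueError, even though the calling code
looks like a plain attribute assignment.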
On the unconference side of the conference, there was a good follow up to
Monday’s talk on Filesystem like APIs. Tommi Virtanen is leading this effort
to brainstorm some ideas about developing a cleaner file and filesystem
interface. The general idea is to collect some of the best ideas in path.py,
twisted.filepath, various other libraries, and see if it’s possible to come up
with a new interface that would consolidate the best and cleanest
functionality, which currently needs multiple modules from multiple locations.
The next step will be a sprint later this week. There are some notes on the
wiki, but they’re probably not too useful unless you were in the session.
The Lightning talks for Tuesday have been really good. I think
they were better than the US PyCon lightning talks, and
EuroPython uses Swiss timing to boot!
The day wrapped up with an excellent and inspiring keynote by Hans Rosling
about the work he has done on Gapminder.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=204
EuroPython 2008 Day 1
Posted 8 July, 2008 by
mikeyp in
conference, Python, community
No Comments
I’m in Vilnius at EuroPython 2008 this week. There are about 250 attendees,
which is somewhat bigger than last year.
This is already shaping up to be a good conference. The overall energy level
here is high. Everyone is enthusiastic about Python and the projects they work
on.
This is definitely a Python community event, and it has the comfortable
feeling of an unconference, while still being well organized. It’s a good
balance.
The overall feel is similar to PyCon, yet at the same time it’s subtly
different in a way that I can’t fully explain yet. It’s not just a
culture or USA / Europe thing - there’s something more to the difference.
So far, I have noticed more of a tendency to continually improve and
challenge the status quo. There is a strong spirit of innovation
and there have been many presentations about ideas and work in progress.
I fought off jet lag, and caught a full day of sessions for Monday.
So far, I attended:
- Marc-Andre Lemburg’s session on the Python database API
- Dinu Gherman’s talk on his work with Paragraphs in ReportLab
- Tommi Virtanen’s talk on Filesystem like APIs
- Ignas Mikalajunas’s talk on the benefits of eggs
- Christian Scholz’s talk on data portability
I had a 4pm talk on using SnapLogic to analyze Trac tickets, which was well
attended. There’s a lot of interest in what we’re doing, and plenty of
questions about what the practical applications are. The Trac integration
project I’m working on should help in that area.
Slides from my talk (and others) are on the EuroPython wiki talks page,
and also on the SnapLogic wiki.
The timezones eventually caught up with me, and I missed Guido van Rossum’s keynote at the end of the day.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=203
Connecting Clouds - Integration and Cloud Computing
Posted 27 June, 2008 by
mikeyp in
Cloud, SOA, ESB, EAI, ETL
1 Comment
I went to the excellent CloudCamp gathering in San Francisco Tuesday night.
CloudCamp was organized by Reuven Cohen, Jesse Silver, Dave Nielsen, and others, and there were
about 350 people there (SnapLogic also sponsored). The timing was good,
since it overlapped with a number of other conferences, which attracted a lot
of folks from out of town. Overall, I declare this a success, and I’m looking
forward to future CloudCamp events.
Cloud Computing is a broad concept, and that was reinforced by the number of
talks and discussions that seemed to always converge on the topic of ‘What is
cloud computing?’
With an evolving concept like cloud computing, it’s hard to really nail down a
definition of what it is, since there really isn’t an it yet. (It reminds
me of the early days of the Internet, and the early days of the World Wide
Web…how could anyone put a definition on either of those in the early days?)
I’m not going to even attempt to define cloud computing. I’m simply treating it
as the general trend toward virtualization of computing resources. (Others have
done a good job of clarifying the space. See Peter Laird’s post on the cloud
market, or Reuven Cohen’s description.)
As of now, the cloud space can be divided into several broad market areas,
primarily:
- Software as a Service (SaaS)
- Platform as a Service (PaaS)
- Infrastructure as a Service (IaaS)
There is a lot of current activity in all of these areas among vendors and
adopters. However, cloud computing will not happen overnight - it will be
a gradual transition and, current hype aside, that transition is ongoing in all
of these areas.
In the meantime, there are still a lot of existing software applications that
will not move to the ‘cloud’ any time soon. Looking at the trends, this raises
a lot of issues about how we are going to integrate all of these into a
cohesive functional unit. As a result, I chaired a session at CloudCamp to
promote some discussion on this topic, and we had 25 or 30 people in the room.
The session didn’t go as smoothly as I hoped, since there was one dissenting
voice, who insisted the solution to all these problems was local storage on the
client side. That aside, we did cover some good topics I’m summarizing here,
and adding some of my other thoughts.
The integration problem
Integration is a very broad term, that encompasses multiple levels, from low
level data interfaces to workflow and higher level business process automation.
Despite the various levels and interests in the room, there was general
consensus that integration is a real issue with cloud computing.
Some participants pointed out that we haven’t fully solved the integration
problem locally. Despite the vision of SOA, and the existence of messaging,
there’s actually a lot of integration done today with Duct Tape and Paper
Clips (or, in hard currency, Perl and PL/SQL). The larger enterprises have the
resources and skills to build their own integration capability, but smaller
businesses (the earliest adopters of SaaS) simply don’t have the infrastructure
in place. 65% of the enterprises surveyed by Forrester in late 2007 indicated
integration issues as their reason for not considering SaaS.
The reality is that the integration matrix is still too complex, with its
permutations of protocols, access methods, and data formats.
Ray Wang of Forrester also notes this integration barrier is ‘more fallacy than
fact’, although I don’t agree with that latter statement - cloud raises some
significant new integration challenges, and they shouldn’t be underestimated.
Why the cloud makes integration harder
If we have difficulty with local integration, the cloud wave only makes integration
more difficult. There are a number of reasons why:
- New integration scenarios
- Access to the cloud may be limited
- Dynamic resources
- Performance
New integration scenarios
Before the cloud model, we had to tie local systems together. With the shift
to a cloud model, we now have to connect local applications to the cloud, and
we also have to connect cloud applications to each other, which adds new
permutations to the matrix.
It’s unlikely that everything will move to a cloud model all at once, so even
the simplest scenarios require some form of local / remote integration. It’s
also likely that we will have applications that never leave the building, due
to regulatory constraints like HIPAA, GLBA, and general security and NPPI
issues. All of this means integration must cross a firewall somewhere.
Cloud to Cloud also raises issues. Do we rely on the (competing) vendors to
provide integration with each other? Where is the integration hosted? Does
the integration live in the cloud as well? And if so, how does it connect to
those local applications?
Access may be limited
Access to cloud resources, either SaaS, PaaS, or pure infrastructure, is more
limited than local applications.
With local applications, we usually had complete access to the application,
even when there were no good integration points in the original design. With
custom applications, adding integration hooks was possible. Even with
commercial applications, it was always possible to slip in database triggers to
raise events and provide hooks for integration access.
Once applications move to the cloud, custom applications must be designed to
support integration because we no longer have that low level of access. For
SaaS, we are dependent on the vendor to provide the integration hooks and
APIs. For example, the SalesForce.com Web Services API doesn’t support
transactions against multiple records, which means integration code has to
handle that logic. For PaaS, the platform might support integration for
applications on the platform. Platform to Platform is still an open question.
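To make "integration code has to handle that logic" concrete, here is a rough
sketch of the compensation pattern that usually ends up in the integration
layer. Everything in it is hypothetical: FakeClient is an in-memory stand-in
for a remote API that accepts one record per call, and its create/delete
methods are not the SalesForce.com API.

```python
class FakeClient:
    """In-memory stand-in for a one-record-per-call remote API (hypothetical)."""

    def __init__(self, fail_on=None):
        self.records = {}
        self.fail_on = fail_on  # record value that the "remote" side rejects
        self._next_id = 1

    def create(self, record):
        if record == self.fail_on:
            raise RuntimeError("remote rejected %r" % (record,))
        record_id = self._next_id
        self._next_id += 1
        self.records[record_id] = record
        return record_id

    def delete(self, record_id):
        del self.records[record_id]


def create_all_or_nothing(client, records):
    """Create every record, or undo the ones already created and re-raise."""
    created = []
    try:
        for record in records:
            created.append(client.create(record))
    except Exception:
        # Compensate in reverse order. This is best effort, not a real
        # transaction: other clients can see the partial state, and a
        # failing delete would leave it behind.
        for record_id in reversed(created):
            client.delete(record_id)
        raise
    return created
```

The caveats in the comment are the point: without transaction support on the
service side, the integration can only approximate atomic behavior.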
Some would argue that a limited set of APIs will improve the situation, since
backdoor access is what led integration into the ‘Duct Tape’ mess in the first
place. They have a valid point. But those APIs must be able to handle the
integration required, and the popularity of BeautifulSoup tells me screen
scraping as a workaround is alive and well.
Dynamic Resources
The true cloud model abstracts away most of our notions of physical hardware,
and everything becomes a service in some form. This is one of the benefits of
cloud, but it also means integration models change. In a world where
applications and infrastructure move and change dynamically, traditional
notions of tightly coupled integration are no longer valid. Add to this the
issues of application versioning (no longer under our control in a SaaS
environment), or PaaS platform changes (also no longer under our control) and
tight coupling becomes a dead end.
It’s clear from the SaaS vendors that the Web is the way to go when it comes to
client access. It seems that lower level interfaces will follow the
same REST route.
Performance
Cloud or not, we still can’t get away from physical limitations. Although we
may see better application scalability and performance, the network distances
between elements in the cloud are no longer under our control. Bandwidth isn’t
the limiting factor in most integration scenarios, round trip latency is. On
the LAN, we can optimize. In the cloud, we lose that ability, and have to live
with longer latency in combination with SLAs from multiple vendors.
These performance constraints change the integration model. Integration can no
longer depend on high performance local access - anyone that does complex
analysis on, say, SalesForce leads, or anyone trying to pull SaaS data into
a warehouse will tell you performance is already an issue.
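A back-of-the-envelope sketch shows why round trip latency dominates. All the
numbers below are assumptions I picked for illustration (100,000 records of
1 KB each, one request per record, 0.5 ms round trips on a LAN versus 80 ms
to a remote service), and bandwidth is held constant to isolate the latency
effect.

```python
def transfer_seconds(calls, round_trip_ms, payload_bytes, bandwidth_bytes_per_s):
    """Estimated time for sequential request/response calls plus data transfer."""
    latency_part = calls * round_trip_ms / 1000.0
    bandwidth_part = payload_bytes / bandwidth_bytes_per_s
    return latency_part + bandwidth_part


RECORDS = 100_000            # assumed record count
PAYLOAD = RECORDS * 1_000    # assumed 1 KB per record
BANDWIDTH = 1_250_000        # assumed 10 Mbit/s, expressed in bytes per second

lan_seconds = transfer_seconds(RECORDS, 0.5, PAYLOAD, BANDWIDTH)
cloud_seconds = transfer_seconds(RECORDS, 80, PAYLOAD, BANDWIDTH)

print("LAN:   %.0f s" % lan_seconds)
print("Cloud: %.0f s" % cloud_seconds)
```

With these assumed figures the data transfer itself takes 80 seconds in both
cases. The difference between roughly two minutes and more than two hours
comes entirely from the round trips.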
What this means for integration
There are a lot of implications as a result of the shift to a cloud model, and
I originally hoped we could get into the deeper issues during CloudCamp, but
time wasn’t on our side.
Cloud vendors understand that integration is an issue, particularly at the
application and platform level. We are increasingly seeing both
integration as a service, and integration as part of the platform.
The question is what form will that integration capability take? Will it be
open and standards based like Open Social? Or will it be vendor and platform
specific? There’s a lot at stake here, because there’s a strong possibility
that integration is becoming part of the vendor lock-in strategy.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=202
What kind of service guarantees can I expect from PaaS?
Posted 14 June, 2008 by
Chris in
PaaS, web services
1 Comment
Peter Laird wrote up a great analysis on Terms of Service for SaaS providers and it got me wondering: What about the emerging Platform as a Service providers?
They’ve got an especially difficult challenge because they’re running other people’s code. How do you ensure availability when it’s not your code? Google’s App Engine conspicuously avoids the problem by not providing any SLA.
Generally speaking, there seem to be three categories of providers:
Infrastructure, managed hosting and run time environments: OpSource, Etelos, Joyent, AWS, GAE, etc
Cloud IDEs: Bungee Labs, etc.
App Builders: CogHead, LongJump, etc.
Obviously, the closer you get to bare metal, the harder it is for you to provide service guarantees.
I’ve already noted that AWS provides for 99.9% in their SLA, but that’s only for the service itself. Nowhere in that document does it say anything about your image’s uptime. Seems that you are on your own entirely.
This is a tricky problem even for the app builders. They all support some kind of scripting environment and you can only sandbox so much. Not to pick on CogHead, but I found this post on their forum. So, as you can see, these problems are going to show up everywhere.
I’d love to hear from anyone that has more details on SLA from platform providers.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=201
Hi-fiving failure and a lesson on how not to treat your users…
Posted 10 June, 2008 by
Chris in
Cloud
2 Comments
I saw today via Techmeme that Twitter was excited and proud that they were able to achieve 97.3% uptime during the Apple WWDC keynote yesterday.
If I were them, I’d be a little more humble and a little more circumspect. Reading through the comments, you’d think they just landed a man on the moon.
First, let me say that under many circumstances achieving 97.3% availability is grounds for termination. Most Enterprise SLAs specify 99.9% or more with service credits applied for failure. Amazon’s SLA provides for 99.9% with 10% credited back if they fall below that and 25% credit below 99%.
Salesforce.com had some serious trouble with availability a while back and people were legitimately wondering if they would survive the crisis. Today, they make a provision for these SLA failure expenses, and so far, have been lucky enough (i.e. smart enough) not to have to make any payments.
Just to put that in perspective, 99.9% uptime translates to about 44 minutes of downtime per month (99%, about 440). So, at 97.3% for the (roughly) four hours of peak time usage during the keynote, they were unavailable for about 6.5 minutes, or nearly 15% of the downtime budget for the month.
This isn’t something I would be proud of.
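For anyone who wants to check the arithmetic, here it is as a quick sketch. The 30-day month is my simplifying assumption; it gives 43.2 minutes for 99.9%, which rounds to the "about 44" above once you use an average-length month.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes, assuming a 30-day month


def downtime_minutes(availability_pct, period_minutes):
    """Minutes of downtime implied by an availability percentage over a period."""
    return period_minutes * (100.0 - availability_pct) / 100.0


budget_999 = downtime_minutes(99.9, MINUTES_PER_MONTH)  # monthly budget at 99.9%
budget_99 = downtime_minutes(99.0, MINUTES_PER_MONTH)   # monthly budget at 99%
keynote_outage = downtime_minutes(97.3, 4 * 60)         # 97.3% over four hours
share_of_budget = keynote_outage / budget_999           # fraction of the 99.9% budget

print("99.9%% budget: %.1f min/month" % budget_999)
print("99%% budget:   %.1f min/month" % budget_99)
print("Keynote:      %.2f min, %.0f%% of the 99.9%% budget"
      % (keynote_outage, share_of_budget * 100))
```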
Second, your users are not your QA or test engineering department. They claim:
…we learned a lot during this stress test and that will translate to better performance down the line.
Finally, turning off features to support peak loads is treating the symptoms, not the underlying problem.
Is it any wonder their site is as unreliable as it is? With this kind of attitude, I don’t think things are going to materially change anytime soon.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=200
Integration remains the number one inhibitor to adopting SaaS…
Posted 9 June, 2008 by
Chris in
SnapLogic, SOA, Open Source
No Comments
CIO magazine reports that a recent survey by Forrester indicates that integration remains the number one inhibitor for Enterprises adopting SaaS.
This comes as no surprise to me.
We’ve been talking to all the large hosted application providers and they all say the same thing: My subscribers have data inside their organization and they need to get it out. The larger the company, the more likely they need to integrate their SaaS application with behind the firewall data and other applications. When integrating that data becomes more difficult (or less secure, or available) the extra integration complexity is traded off against the simplicity of a SaaS alternative. Once the complexity gets too large, it can quickly overcome the benefits of SaaS.
Integrating SaaS with internal apps remains problematic since access generally requires software that runs inside the organization to manipulate and transform the data and to initiate integration with the SaaS apps for higher security.
There are a number of hosted integration alternatives emerging, but few adequately address the behind the firewall issue. The ones that do suffer from either being too complex for the SaaS marketplace, or too limited in function to handle the task.
We here at SnapLogic have been working on the problem for a while now. We think we’ve got the right approach to this problem. We’ve got open source software that you can download and run wherever the problem lives. Our distributed approach allows you to apply the necessary access and/or transformation functions wherever they’re needed, and deploy it on top of your existing infrastructure, whether that is in your data center, in the DMZ, or in a co-location facility. And since it works just like the web, you already know how to keep it running fast and secure.
Of all the possible deployment scenarios for SnapLogic, integrating SaaS applications with enterprise apps is by far the most common.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=199
« Previous
Posted 18 July, 2008 by Chris in SaaS, SugarCRM
1 Comment
Sarah Lacy’s article in Business Week is a sobering reminder of the challenges software companies face building on-demand businesses. The point made vividly clear here:
“SAP thought customers would go to a Web site, configure it themselves, and found the first hundred or so implementations required a lot of time and a lot of tremendous costs,” Richardson says. “Small businesses are calling for support, calling SAP because they don’t have IT departments. SAP is spending a lot of resources to configure and troubleshoot the problem.”
Nick Carr chimes in here with the point:
Anyone who thinks the software-as-a-service business is a gold mine is wrong. The economics are fundamentally different from those of the traditional software business - and not in a good way
I spoke yesterday to a SaaS platform provider who’s business is to help ISVs deliver their solutions as a service. They described some of the challenges their customers face being multi-tenancy, scalability, security, APIs, not to mention simply cramming the features into a browser.
I’ve watched other companies launch on-demand offerings with great fan fare only to find later that the offering languished among their traditionally delivered alternative. Then, after facing the brutal reality of the SaaS model purge all references to the SaaS offering (don’t they know there’s a Wayback Machine?).
As Sarah notes, there’ s no putting the genie back in the bottle. So what’s a company to do?
One point that Sara mentions but doesn’t fully explore is the role of open source in SaaS. There are many dimensions to this including how SaaS companies have leveraged open source to more quickly deliver their services, or how some open source companies are trying to reign in the free riders with the AGPL.
However, I believe the real impact of open source on SaaS will be because of it’s distribution model, not its development model. JBOSS’s sales machines is well known among open source businesses. They were tremendously successful monetizing their community which dramatically reduced the costs of sales and marketing. In fact, most professional open source companies benefit more from more efficient distribution than development.
So what’s this mean for SaaS companies? The successful ones are going to not only leverage the development model of open source, but perhaps more importantly, the distribution model as well. There are already examples of this. SugarCRM is perhaps the most prominent open source company that has extended their model to include SaaS. That’s the fastest growing part of their business and already represents about 30% of their total subscribers.
I believe this is an exciting evolution for both SaaS and open source businesses.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=208
EuroPython 2008, Day 3
Posted 13 July, 2008 by
mikeyp in
conference, Python, community
No Comments
Day 3 of EuroPython continued with the same intensity as the previous two
days. The talks I attended on Wednesday were:
- Jussi Rasinmaki’s talk on use the batteries included
A great ”Python in Action” advocacy talk on leveraging Python in a real
application for forest management.
- Beatrice During and Holger Krekel’s update on PyPy - Behind the scenes
This session was a summary of the current and planned PyPy activities and
roadmap. The part that really caught my attention (apart from the PyPy activity
itself) was the level of EU government s ponsorship and funding of open source
development. We really don’t see that in the US. Or, maybe I just haven’t been
looking….
- Mike Cariaso’s talk on Python in the Amazon Cloud
Mike is using Amazon EC2 with Python to perform compute intensive DNA analysis.
He has developed runblast to make EC2 more accessible to ordinary humans.
The talk also went into some cool details about SNPedia and promethease.
- Andreas Schreiber’s talk on DataFinder
Andreas described the problems at DLR with managing large scientific datasets.
They looked at commercial data management systems, and found them to be
expensive and top heavy with useless features. The tools also used proprietary
or unusable scripting functionality. As a result, they decided to build their
own. They developed a prototype in Java, but had problems with platform support
(write once, debug everywhere?). However, the users liked the embedded Jython
capabilities in the original prototype and requested a Python solution. This
resulted in a final implementation using Python. Key reasons for using Python
at DLR were: easy to learn, rapid development, inherently maintainable.
The last block of sessions for the day was the only time in the conference
where I really had trouble making a decision between talks. I opted for Jack
Diederich’s Class Decorators: Radically Simple, which meant I missed Gasper
Zejn’s Managing Computing Clouds on Unreliable Nodes with Python.
As for class decorators, it’s just one more reason for me to start using
Python 2.6 / 3.0. Jack wrote the reference implementation for PEP 3129, and this
talk really did radically simplify the use cases for class decorators.
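To make the idea concrete, here is a minimal sketch of a class decorator in the PEP 3129 style. It is my own illustration, not code from Jack's talk; the `register` decorator, the `PLUGINS` registry, and the `CsvReader` class are hypothetical names.

```python
# Hypothetical plugin registry; all names here are illustrative only.
PLUGINS = {}


def register(cls):
    """Class decorator: record the class under its name and return it unchanged."""
    PLUGINS[cls.__name__] = cls
    return cls


@register
class CsvReader:
    def read(self, text):
        # Split each line of the text into a list of comma-separated fields.
        return [line.split(",") for line in text.splitlines()]


# Without class decorator syntax, the same effect needs a separate statement
# after the class body: CsvReader = register(CsvReader)
reader = PLUGINS["CsvReader"]()
print(reader.read("a,b\nc,d"))
```

The appeal is that the registration sits right next to the class statement instead of trailing after the class body, where it is easy to forget.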
The Lightning talks for Wednesday covered a lot of ground and, as usual, I
learned a few new tricks.
One library I never knew about was Stefan Schwarzer’s ftputil, which implements
a high level API for FTP. This is essentially a virtual filesystem which
implements os and os.path functionality.
There was also a lightning talk on using reStructuredText and docutils to
generate S5 presentations. I started using this tool chain a couple of months
ago, and I’m really beginning to like it. One text file can generate slides and
printed documentation which work in any web browser. It’s really useful for
notes, tutorials, and other basic presentations. This is not really the tool
for slick, animated, whiz bang stuff, but I rarely do that type of
presentation.
Inspired by Hans Rosling’s keynote, I also did a lightning talk on public data.
After three days, I have concluded that the EuroPython community is really not
all that different to the community in the US (despite comments I’ve heard to
the contrary.)
The EuroPython conference had fewer Django specific users than PyCon, but there
was definitely a surge in Django related attendance at PyCon 2008. That effect
might continue next year at EuroPython (which will be in Birmingham, UK). On
the other hand, there were way more Zope/Plone developers at EuroPython. It’s
not clear to me whether this is because Zope is more popular in Europe, or
because there’s more overlap of the communities in Europe. I suspect it’s a
combination of both.
EuroPython 2008 was definitely worth the trip.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=207
An ESB for the Web? Who Needs it?
Posted 11 July, 2008 by
Chris in
ESB, web services
1 Comment
Who needs it? Seriously.
I just read Jim Stogdill’s post on O’Reilly Radar that describes the new data portability hub Gnip as ‘An ESB for the Web’.
It’s something that occurred to me as well when I first heard about it, but I quickly realized that, while it might be useful for some Data Portability transformations and social networking site integrations, it’s a long way from being an ESB for the web, or even useful outside of the social networking domain.

For starters, their protocol bridge doesn’t even mention SOAP (I’m not a big fan of SOAP, but let’s be realistic. Every transactional application on the web today has a SOAP interface). From there, there are about 100 other things that make this seriously deficient as an ESB for the Web.
Which is perfectly fine for what they’re doing. They are working toward ‘Making Data Portability Suck Less’, which is something I really do think they can achieve with this bridge. But Jim, let’s not get carried away and ascribe to them capabilities (not to mention a strategy) that they don’t seem interested in.
One thing that Jim points out that is dead-on is that web people and enterprise people think differently. Which explains why there’s no SOAP on the Gnip bridge, and why enterprise people still think they even need an ESB for the web. I could go on and dig into all the details, but it’s presented more clearly here than I ever could.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=206
SnapLogic for EC2 now available…
Posted 8 July, 2008 by
Chris in
PaaS, Cloud
No Comments
I’m happy to announce that we just released an updated version of SnapLogic for EC2.
Just like that, we’re a Cloud Computing Company. Seems like everyone, everywhere is a Cloud Computing Company. Heck, even The Onion knows this.
But seriously, this is a natural step for us. From the ground up, SnapLogic was built with the web in mind, so making it available for use on Amazon’s Web Services (AWS) was a natural choice. AWS users are forward thinking and want to use the power of the web for everything they do. SnapLogic is attracting these kinds of users because everything they’re doing to build scalable, reliable, secure web applications applies equally to their integrations with SnapLogic. Our re-usable RESTful components, web-friendly protocols and browser-based IDE fit perfectly with their PaaS and SaaS strategies.
So just as they look to hosted alternatives for applications, they’re increasingly looking for hosted alternatives for their platforms and infrastructure. SnapLogic for AWS is a perfect fit.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=205
EuroPython 2008, Day 2
Posted 8 July, 2008 by
mikeyp in
conference, Python, community
No Comments
Day 2 of EuroPython 2008 has been just as good as day 1.
Today, I attended:
- Dinu Gherman’s talk on Visualising Relationships Between Python Objects.
This was a description of a prototype he’s working on for visualization of
Python object relationships. Based on GraphViz, it appears promising, and it
could have interesting applications for documenting and understanding code.
- Peter Bulychev’s talk on Duplicate code detection using his Clone Digger.
Another interesting talk on a system for detecting duplicate code in Python
programs.
- John Pinner’s talk on Conferencing systems in Python.
A good discussion on the support system for the EuroPython and PyCon UK
conferences. Conference organizing systems seem to be an endless source of
challenges for developers.
- Raymond Hettinger’s Descriptor tutorial
A very good talk on the internals of Python descriptors.
Now, all your dots are belong to us.
- Stefan Behnel’s introduction to the Cython compiler
Python to C, lots of uses for performance optimization and external library
interfacing.
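For anyone who missed the descriptor tutorial, the core idea fits in a few lines. This is a minimal sketch of my own, not code from Raymond's talk; the `Positive` and `Order` names are made up for illustration.

```python
class Positive:
    """Data descriptor that rejects non-positive values."""

    def __init__(self, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class itself: return the descriptor object.
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        obj.__dict__[self.name] = value


class Order:
    quantity = Positive("quantity")

    def __init__(self, quantity):
        self.quantity = quantity  # routed through Positive.__set__


order = Order(3)
print(order.quantity)
```

Every `order.quantity` lookup and assignment (every "dot") goes through the descriptor's `__get__` and `__set__`, which is the hook the joke above refers to.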
On the unconference side of the conference, there was a good follow up to
Monday’s talk on Filesystem like APIs. Tommi Virtanen is leading this effort
to brainstorm some ideas about developing a cleaner file and filesystem
interface. The general idea is to collect some of the best ideas in path.py,
twisted.filepath, various other libraries, and see if it’s possible to come up
with a new interface that would consolidate the best and cleanest
functionality, which currently needs multiple modules from multiple locations.
The next step will be a sprint later this week. There are some notes on the
wiki, but they’re probably not too useful unless you were in the session.
The Lightning talks for Tuesday were really good. I think
they were better than the US PyCon lightning talks, and
EuroPython uses Swiss timing to boot!
The day wrapped up with an excellent and inspiring keynote by Hans Rosling
about the work he has done on Gapminder.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=204
EuroPython 2008 Day 1
Posted 8 July, 2008 by
mikeyp in
conference, Python, community
No Comments
I’m in Vilnius at EuroPython 2008 this week. There are about 250 attendees,
which is somewhat bigger than last year.
This is already shaping up to be a good conference. The overall energy level
here is high. Everyone is enthusiastic about Python and the projects they work
on.
This is definitely a Python community event, and it has the comfortable
feeling of an unconference, while still being well organized. It’s a good
balance.
The overall feel is similar to PyCon, yet at the same time it’s subtly
different in a way that I can’t fully explain yet. It’s not just a
culture or USA / Europe thing - there’s something more to the difference.
So far, I have noticed more of a tendency to continually improve and
challenge the status quo. There is a strong spirit of innovation
and there have been many presentations about ideas and work in progress.
I fought off jet lag, and caught a full day of sessions on Monday.
So far, I attended:
- Marc-Andre Lemburg’s session on the Python database API
- Dinu Gherman’s talk on his work with Paragraphs in ReportLab
- Tommi Virtanen’s talk on Filesystem like APIs
- Ignas Mikalajunas’s talk on the benefits of eggs
- Christian Scholz’s talk on data portability
I had a 4pm talk on using SnapLogic to analyze Trac tickets, which was well
attended. There’s a lot of interest in what we’re doing, and plenty of
questions about what the practical applications are. The Trac integration
project I’m working on should help in that area.
Slides from my talk (and others) are on the EuroPython wiki talks page,
and also on the SnapLogic wiki.
The timezones eventually caught up with me, and I missed Guido van Rossum’s keynote at the end of the day.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=203
Connecting Clouds - Integration and Cloud Computing
Posted 27 June, 2008 by
mikeyp in
Cloud, SOA, ESB, EAI, ETL
1 Comment
I went to the excellent CloudCamp gathering in San Francisco Tuesday night.
CloudCamp was organized by Reuven Cohen, Jesse Silver, Dave Nielsen, and others, and there were
about 350 people there (SnapLogic also sponsored). The timing was good,
since it overlapped with a number of other conferences, which attracted a lot
of folks from out of town. Overall, I declare this a success, and I’m looking
forward to future CloudCamp events.
Cloud Computing is a broad concept, and that was reinforced by the number of
talks and discussions that seemed to always converge on the topic of ‘What is
cloud computing?’
With an evolving concept like cloud computing, it’s hard to really nail down a
definition of what it is, since there really isn’t an it yet. (It reminds
me of the early days of the Internet, and the early days of the World Wide
Web…how could anyone put a definition on either of those in the early days?)
I’m not going to even attempt to define cloud computing. I’m simply treating it
as the general trend toward virtualization of computing resources. (Others have
done a good job of clarifying the space. See Peter Laird’s post on the cloud
market, or Reuven Cohen’s description.)
As of now, the cloud space can be divided into several broad market areas,
primarily:
- Software as a Service (SaaS)
- Platform as a Service (PaaS)
- Infrastructure as a Service (IaaS)
There is a lot of current activity in all of these areas among vendors and
adopters. However, cloud computing will not happen overnight - it will be
a gradual transition and, current hype aside, that transition is ongoing in all
of these areas.
In the meantime, there are still a lot of existing software applications that
will not move to the ‘cloud’ any time soon. Looking at the trends, this raises
a lot of issues about how we are going to integrate all of these into a
cohesive functional unit. As a result, I chaired a session at CloudCamp to
promote some discussion on this topic, and we had 25 or 30 people in the room.
The session didn’t go as smoothly as I hoped, since there was one dissenting
voice, who insisted the solution to all these problems was local storage on the
client side. That aside, we did cover some good topics I’m summarizing here,
and adding some of my other thoughts.
The integration problem
Integration is a very broad term, that encompasses multiple levels, from low
level data interfaces to workflow and higher level business process automation.
Despite the various levels and interests in the room, there was general
consensus that integration is a real issue with cloud computing.
Some participants pointed out that we haven’t fully solved the integration problem
locally. Despite the vision of SOA, and the existence of messaging, there’s
actually a lot of integration done today with Duct Tape and Paper Clips (or,
in hard currency, Perl and PL/SQL). The larger enterprises have the
resources and skills to build their own integration capability, but smaller
businesses (the earliest adopters of SaaS) simply don’t have the infrastructure
in place. 65% of the enterprises surveyed by Forrester in late 2007 indicated
integration issues as their reason for not considering SaaS.
The reality is that the integration matrix is still too complex, with its
permutations of protocols, access methods, and data formats.
Ray Wang of Forrester also notes this integration barrier is ‘more fallacy than
fact’, although I don’t agree with that latter statement - cloud raises some
significant new integration challenges, and they shouldn’t be underestimated.
Why the cloud makes integration harder
If we have difficulty with local integration, the cloud wave only makes integration
more difficult. There are a number of reasons why:
- New integration scenarios
- Access to the cloud may be limited
- Dynamic resources
- Performance
New integration scenarios
Before the cloud model, we had to tie local systems together. With the shift
to a cloud model, we now have to connect local applications to the cloud, and
we also have to connect cloud applications to each other, which adds new
permutations to the matrix.
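A rough way to see how the matrix grows is to count the possible point-to-point links by category. The sketch below assumes every pair of systems might need its own link, which is a worst case; the system names are hypothetical.

```python
from itertools import combinations

# Hypothetical inventory; the names are illustrative only.
local_apps = ["erp", "warehouse_db"]
cloud_apps = ["crm_saas", "billing_saas", "paas_app"]


def count_links(local, cloud):
    """Count possible point-to-point links in each integration category."""
    return {
        "local-local": len(list(combinations(local, 2))),
        "local-cloud": len(local) * len(cloud),
        "cloud-cloud": len(list(combinations(cloud, 2))),
    }


links = count_links(local_apps, cloud_apps)
print(links, "total:", sum(links.values()))
```

With two local and three cloud systems, only one of the ten possible links is the familiar local-to-local kind. The rest are the new scenarios described above.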
It’s unlikely that everything will move to a cloud model all at once, so even
the simplest scenarios require some form of local / remote integration. It’s
also likely that we will have applications that never leave the building, due
to regulatory constraints like HIPAA, GLBA, and general security and NPPI
issues. All of this means integration must cross a firewall somewhere.
Cloud to Cloud also raises issues. Do we rely on the (competing) vendors to
provide integration with each other? Where is the integration hosted? Does
the integration live in the cloud as well? And if so, how does it connect to
those local applications?
Access may be limited
Access to cloud resources, either SaaS, PaaS, or pure infrastructure, is more
limited than local applications.
With local applications, we usually had complete access to the application,
even when there were no good integration points in the original design. With
custom applications, adding integration hooks was possible. Even with
commercial applications, it was always possible to slip in database triggers to
raise events and provide hooks for integration access.
Once applications move to the cloud, custom applications must be designed to
support integration because we no longer have that low level of access. For
SaaS, we are dependent on the vendor to provide the integration hooks and
APIs. For example, the SalesForce.com Web Services API doesn’t support
transactions against multiple records, which means integration code has to
handle that logic. For PaaS, the platform might support integration for
applications on the platform. Platform to Platform is still an open question.
Some would argue that a limited set of APIs will improve the situation, since
backdoor access is what led integration into the ‘Duct Tape’ mess in the first
place. They have a valid point. But those APIs must be able to handle the
integration required, and the popularity of BeautifulSoup tells me screen
scraping as a workaround is alive and well.
Dynamic Resources
The true cloud model abstracts away most of our notions of physical hardware,
and everything becomes a service in some form. This is one of the benefits of
cloud, but it also means integration models change. In a world where
applications and infrastructure move and change dynamically, traditional
notions of tightly coupled integration are no longer valid. Add to this the
issues of application versioning (no longer under our control in a SaaS
environment), or PaaS platform changes (also no longer under our control) and
tight coupling becomes a dead end.
It’s clear from the SaaS vendors that the Web is the way to go when it comes to
client access. It seems that lower level interfaces will follow the
same REST route.
Performance
Cloud or not, we still can’t get away from physical limitations. Although we
may see better application scalability and performance, the network distances
between elements in the cloud are no longer under our control. Bandwidth isn’t
the limiting factor in most integration scenarios, round trip latency is. On
the LAN, we can optimize. In the cloud, we lose that ability, and have to live
with longer latency in combination with SLAs from multiple vendors.
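A back-of-the-envelope model shows why latency dominates. This is a sketch under assumed numbers (record size, link speed, and round trip times are all made up for illustration), treating each request as one round trip plus the time to push the bytes.

```python
def transfer_seconds(records, batch_size, rtt_s, bytes_per_record, bandwidth_bps):
    """Estimate time to move records when each batch costs one round trip."""
    round_trips = -(-records // batch_size)  # ceiling division
    latency_cost = round_trips * rtt_s
    bandwidth_cost = records * bytes_per_record * 8 / bandwidth_bps
    return latency_cost + bandwidth_cost


# Assumed figures, for illustration only: 10,000 records of 500 bytes each
# over a 100 Mbit/s link, with a 0.5 ms LAN and an 80 ms WAN round trip.
lan = transfer_seconds(10_000, 1, 0.0005, 500, 100e6)
wan = transfer_seconds(10_000, 1, 0.080, 500, 100e6)
wan_batched = transfer_seconds(10_000, 200, 0.080, 500, 100e6)
print(lan, wan, wan_batched)
```

Under these assumptions the bandwidth term is the same 0.4 seconds in every case. What changes is the round trip term: one record per call takes seconds on the LAN and minutes over the WAN, and sending 200 records per call brings the WAN case back to a few seconds, provided the vendor's API allows batching.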
These performance constraints change the integration model. Integration can no
longer depend on high performance local access - anyone that does complex
analysis on, say, SalesForce leads, or anyone trying to pull SaaS data into
a warehouse will tell you performance is already an issue.
What this means for integration
There are a lot of implications as a result of the shift to a cloud model, and
I originally hoped we could get into the deeper issues during CloudCamp, but
time wasn’t on our side.
Cloud vendors understand that integration is an issue, particularly at the
application and platform level. We are increasingly seeing both
integration as a service, and integration as part of the platform.
The question is what form will that integration capability take? Will it be
open and standards based like Open Social? Or will it be vendor and platform specific?
There’s a lot at stake here,
because there’s a strong possibility that integration is becoming part of the
vendor lock-in strategy.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=202
What kind of service guarantees can I expect from PaaS?
Posted 14 June, 2008 by
Chris in
PaaS, web services
1 Comment
Peter Laird wrote up a great analysis on Terms of Service for SaaS providers and it got me wondering: What about the emerging Platform as a Service providers?
They’ve got an especially difficult challenge because they’re running other people’s code. How do you ensure availability when it’s not your code? Google’s App Engine conspicuously avoids the problem by not providing any SLA.
Generally speaking, there seem to be three categories of providers:
Infrastructure, managed hosting and run time environments: OpSource, Etelos, Joyent, AWS, GAE, etc
Cloud IDEs: Bungee Labs, etc.
App Builders: CogHead, LongJump, etc.
Obviously, the closer you get to bare metal, the harder it is for you to provide service guarantees.
I’ve already noted that AWS provides for 99.9% in their SLA, but that’s only for the service itself. Nowhere in that document does it say anything about your image’s uptime. Seems that you are on your own entirely.
This is a tricky problem even for the app builders. They all support some kind of scripting environment and you can only sandbox so much. Not to pick on CogHead, but I found this post on their forum. So, as you can see, these problems are going to show up everywhere.
I’d love to hear from anyone that has more details on SLA from platform providers.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=201
High-fiving failure and a lesson on how not to treat your users…
Posted 10 June, 2008 by
Chris in
Cloud
2 Comments
I saw today via Techmeme that Twitter was excited and proud that they were able to achieve 97.3% uptime during the Apple WWDC keynote yesterday.
If I were them, I’d be more humble and a little more circumspect. Reading through the comments, you’d think they just landed a man on the moon.
First, let me say that under many circumstances achieving 97.3% availability is grounds for termination. Most Enterprise SLAs specify 99.9% or more with service credits applied for failure. Amazon’s SLA provides for 99.9% with 10% credited back if they fall below that and 25% credit below 99%.
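As a sketch, the credit tiers quoted above can be written as a simple lookup. The thresholds are the ones stated in this post, not checked against any current SLA text, and the function name is my own.

```python
def service_credit(uptime_pct):
    """Credit fraction under the tiers quoted above:
    10% below 99.9% uptime, 25% below 99%."""
    if uptime_pct < 99.0:
        return 0.25
    if uptime_pct < 99.9:
        return 0.10
    return 0.0


print(service_credit(97.3))
```

Sustained over a billing period, 97.3% would land in the 25% credit tier.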
Salesforce.com had some serious trouble with availability a while back and people were legitimately wondering if they would survive the crisis. Today, they make a provision for these SLA failure expenses, and so far, have been lucky enough (i.e. smart enough) not to have to make any payments.
Just to put that in perspective, 99.9% uptime translates to about 44 minutes of downtime per month (99%, about 440). So, at 97.3% for the (roughly) four hours of peak time usage during the keynote, they were unavailable for about 6.5 minutes, or nearly 15% of the downtime budget for the month.
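The arithmetic is easy to check. This sketch assumes an average month of 365.25 / 12 days and a four hour keynote window; the function and variable names are mine.

```python
MINUTES_PER_MONTH = 365.25 / 12 * 24 * 60  # average month, an assumption


def downtime_budget(uptime_pct, window_minutes=MINUTES_PER_MONTH):
    """Minutes of downtime allowed in a window at a given uptime percentage."""
    return window_minutes * (100 - uptime_pct) / 100


monthly_999 = downtime_budget(99.9)
monthly_99 = downtime_budget(99.0)
keynote_outage = downtime_budget(97.3, window_minutes=4 * 60)
share_of_budget = keynote_outage / monthly_999

print(monthly_999, monthly_99, keynote_outage, share_of_budget)
```

With those assumptions the monthly budgets come out at roughly 44 and 438 minutes, the keynote outage at roughly 6.5 minutes, and the share of the 99.9% monthly budget at just under 15%.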
This isn’t something I would be proud of.
Second, your users are not your QA or test engineering department. They claim:
…we learned a lot during this stress test and that will translate to better performance down the line.
Finally, turning off features to support peak loads is treating the symptoms, not the underlying problem.
Is it any wonder their site is as unreliable as it is? With this kind of attitude, I don’t think things are going to materially change anytime soon.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=200
Integration remains the number one inhibitor to adopting SaaS…
Posted 9 June, 2008 by
Chris in
SnapLogic, SOA, Open Source
No Comments
CIO magazine reports that a recent survey by Forrester indicates that integration remains the number one inhibitor for Enterprises adopting SaaS.
This comes as no surprise to me.
We’ve been talking to all the large hosted application providers and they all say the same thing: My subscribers have data inside their organization and they need to get it out. The larger the company, the more likely they need to integrate their SaaS application with behind the firewall data and other applications. When integrating that data becomes more difficult (or less secure, or available) the extra integration complexity is traded off against the simplicity of a SaaS alternative. Once the complexity gets too large, it can quickly overcome the benefits of SaaS.
Integrating SaaS with internal apps remains problematic since access generally requires software that runs inside the organization to manipulate and transform the data and to initiate integration with the SaaS apps for higher security.
There are a number of hosted integration alternatives emerging, but few adequately address the behind the firewall issue. The ones that do suffer from either being too complex for the SaaS marketplace, or too limited in function to handle the task.
We here at SnapLogic have been working on the problem for a while now. We think we’ve got the right approach to this problem. We’ve got open source software that you can download and run wherever the problem lives. Our distributed approach allows you to apply the necessary access and/or transformation functions wherever they’re needed, and deploy it on top of your existing infrastructure, whether that’s in your data center, in the DMZ, or in a co-location facility. And since it works just like the web, you already know how to keep it running fast and secure.
Of all the possible deployment scenarios for SnapLogic, integrating SaaS applications with enterprise apps is by far the most common.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=199
« Previous
Posted 13 July, 2008 by mikeyp in conference, Python, community
No Comments
Day 3 of EuroPython continued with the same intensity as the previous two days. The talks I attended on Wednesday were:
- Jussi Rasinmaki’s talk on use the batteries included
A great ”Python in Action” advocacy talk on leveraging Python in a real application for forest management.
- Beatrice During and Holger Krekel’s update on PyPy - Behind the scenes
This session was a summary of the current and planned PyPy activities and roadmap. The part that really caught my attention (apart from the PyPy activity itself) was the level of EU government s ponsorship and funding of open source development. We really don’t see that in the US. Or, maybe I just haven’t been looking….
- Mike Cariaso’s talk on Python in the Amazon Cloud
Mike is using Amazon EC2 with Python to perform compute intensive DNA analysis. He has developed runblast to make EC2 more accessible to ordinary humans. The talk also went into some cool details about SNPedia and promethease.
- Andreas Schreiber’s talk on DataFinder
Andreas described the problems at DLR with managing large scientific datasets. They looked at commercial data management systems, and found them to be expensive and top heavy with useless features.Also the tools uses proprietary or unusable scripting functionality. As a result, they decided to build their own. They develped a prototype in Java, but had problems with platform support (write once, debug everywhere ? ) However, the users liked the embedded Jython capabilities in the original protptype and requested a Python solution. This resulted in a final implementation using Python. Key reasons for using Python at DLR were : easy to learn, rapid development, inherently maintainable.
The last block of sessions for the day was the only time in the conference wehre I really had trouble making a decision betwwen talks. I opted for Jack Diederich’s Class Decorators: Radically Simple, which meant I missed Gasper Zejn’s Managing Computing Clouds on Unreliable Nodes with Python
As for class decorators, it’s just one more reason for me to start using Python 2.6 / 3.0. Jack wrote the reference implementation for PEP3129, and this talk really did radically simplify the use cases for class decorators.
The Lightning talks for Wednesday covered a lot of ground and, as usual, I learned a few new tricks.
One library I never knew about was Stefan Swarzer’s ftputil, which implements a high level API for ftp. This is essentially a virtual filesystem which implements os and os.path functionality.
There was also a lightning talk on using Restructured Text and docutils to generate S5 presentations. I started using this tool chain a couple of months ago, and I’m really beginning to like it. One text file can generate slides and printed documentation which work in any web browser. It’s really useful for notes, tutorials, and other basic presentations. This is not really the tool for slick, animated, whiz bang stuff, but I rarely do those type of presentations.
Inspired by Hans Rosling’s keynote, I also did a lightning talk on public data.
After three days, I have concluded that the EuroPython community is really not all that different to the community in the US (despite comments I’ve heard to the contrary.)
The EuroPython conference had fewer Django specific users than PyCon, but there was definitely a surge in Django related attendance at PyCon 2008. That affect might continue next year at Euro Python (which will be in Birmingham, UK.) On the other hand, there were way more Zope/Plone developers at EuroPython. It’s not clear to me whether this is because Zope is more popular in Europe, or because there’s more overlap of the communities Europe. I suspect its a combination of both.
EuroPython 2008 was definitely worth the trip.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=207
An ESB for the Web? Who Needs it?
Posted 11 July, 2008 by
Chris in
ESB, web services
1 Comment
Who needs it? Seriously.
I just read Jim Stogdill’s post on O’Reilly Rader that describes the new data portability hub Gnip as ‘An ESB for the Web’
It’s something that occurred to me as well when I first heard about it, but I quickly realized that, while it might be useful for some Data Portability transformations and social networking site integrations, it’s a long way from being an ESB for the web, or even useful outside of the social networking domain.

For starters, their protocol bridge doesn’t even mention SOAP (I’m not a big fan of SOAP, but lets be realistic. Every transactional application on the web today has a SOAP interface). From there, there are about 100 other things that make this seriously deficient as an ESB for the Web.
Which is perfectly fine for what they’re doing. They are working toward ‘Making Data Portability Suck Less’, which is something I really do think they can achieve with this bridge. But Jim, lets not get carried away and ascribe to them capabilities (not to mention a strategy) that they don’t seem interested in.
One thing that Jim points out that is dead-on is that web people and enterprise people think differently. Which explains why there’s no SOAP on the Gnip bridge, and why enterprise people still think they even need an ESB for the web. I could go on and dig into all the details, but it’s presented more clearly here than I ever could.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=206
SnapLogic for EC2 now available…
Posted 8 July, 2008 by
Chris in
PaaS, Cloud
No Comments
I’m happy to announce that we just release an updated version of SnapLogic for EC2.
Just like that, we’re a Cloud Computing Company. Seems like everyone, everywhere is a Cloud Computing Company. Heck, even The Onion knows this.
But seriously, this is a natural step for us. From the ground up, SnapLogic was built with the web in mind so making it available for use on Amazon’s Web Services (AWS) was a natural choice. AWS users are forward thinking and want to use the power of the web for everything they do. SnapLogic is attracting these kinds of users because everthing they’re doing to build scalable, reliable, secure web applications applies equally to their integrations with SnapLogic. Our re-usable RESTful components, web-friendly protocols and browser-based IDE fit perfectly with their PaaS and SaaS strategies.
So just as they look to hosted alternatives for applications, they’re increasingly looking for hosted alternatives for their platforms and infrastructure. SnapLogic for AWS is a perfect fit.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=205
EuroPython 2000, Day 2
Posted 8 July, 2008 by
mikeyp in
conference, Python, community
No Comments
Day 2 of EuroPython 2008 has been just as good as day 1.
Today, I attended:
-
Dinu Gherman’s talk on Visualising Relationships Between Python Objects.
This was a description of a prototype he’s working on for visualization of
Python object relationships. Based on GraphViz, it appears promising, and it
could have interesting applications for documenting and understanding code.
-
Peter Bulychev’s talk on Duplicate code detection using his Clone Digger.
Another interesting talk on a system for detecting duplicate code in Python
programs.
-
John Pinner’s talk on Conferencing systems in Python.
A good discussion on the support system for the EuroPython and PyCon UK
conferences. Conference organizing systems seem to be a endless source of
challenges for developers.
-
Raymond Hettinger’s Descriptor tutorial
A very good talk on the internals of Python descriptors.
Now, all your dots are belong to us.
-
Stefan Behnel’s introduction to the Cython compiler
Python to C, lots of uses for performance optimization and external library
interfacing.
On the unconference side of the conference , there was a good follow up to
Monday’s talk on Filesystem like API’s. Tommi Virtanen is leading this effot
to brainstorm some ideas about developing a cleaner file and filesystem
interface. The general idea is to collect some of the best ideas in path.py ,
twisted.filepath, various other libraries, and see if it’s possible to come up
with a new interface that would consolidate the best and cleanest
functionality, which currently needs multiple modules from multiple locations.
The next step will be a sprint later this week. There are some notes on the
wiki, but they’re probably not too useful unless you were in the session.
The Lightning talks for Tuesday have really good. I think
they were better than than the US PyCon lightning talks, and
EuroPython uses Swiss timing to boot!
The day wrapped up with an excellent and inspiring keynote by Hans Rosling
about the work he has done on Gapminder.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=204
EuroPython 2008 Day 1
Posted 8 July, 2008 by
mikeyp in
conference, Python, community
No Comments
I’m Vlinius at EuroPython 2008 this week. There are about 250 attendees,
which is somewhat bigger that last year.
This is already shaping up to be a good conference. The overall energy level
here is high. Everyone is enthusiastic about Python and the projects they work
on.
This is definitely a Python community event, and it has the comfortable
feeling of an unConference, while still being well organized. It’s a good
balance.
The overall feel is similar to PyCon, yet at the same time it’s subtly
different in a way that I can’t fully explain yet. It’s not just a
culture or USA / Europe thing - there’s something more to the difference.
So far, I have noticed more of a tendency to continually improve and
challenge the status quo. There is a strong spirit of innovation
and there have been many presentations about ideas and work in progress.
I fought off jet lag and caught a full day of sessions on Monday.
So far, I attended:
- Marc-Andre Lemburg’s session on the Python database API
- Dinu Gherman’s talk on his work with Paragraphs in ReportLab
- Tommi Virtanen’s talk on filesystem-like APIs
- Ignas Mikalajunas’s talk on the benefits of eggs
- Christian Scholz’s talk on data portability
I had a 4pm talk on using SnapLogic to analyze Trac tickets, which was well
attended. There’s a lot of interest in what we’re doing, and plenty of
questions about what the practical applications are. The Trac integration
project I’m working on should help in that area.
Slides from my talk (and others) are on the EuroPython wiki talks page,
and also on the SnapLogic wiki.
The timezones eventually caught up with me, and I missed Guido van Rossum’s keynote at the end of the day.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=203
Connecting Clouds - Integration and Cloud Computing
Posted 27 June, 2008 by
mikeyp in
Cloud, SOA, ESB, EAI, ETL
1 Comment
I went to the excellent CloudCamp gathering in San Francisco Tuesday night.
CloudCamp was organized by Reuven Cohen, Jesse Silver, Dave Nielsen, and others, and there were
about 350 people there (SnapLogic also sponsored). The timing was good,
since it overlapped with a number of other conferences, which attracted a lot
of folks from out of town. Overall, I declare this a success, and I’m looking
forward to future CloudCamp events.
Cloud Computing is a broad concept, and that was reinforced by the number of
talks and discussions that seemed to always converge on the topic of ‘What is
cloud computing?’
With an evolving concept like cloud computing, it’s hard to really nail down a
definition of what it is, since there really isn’t an it yet. (It reminds
me of the early days of the Internet, and the early days of the World Wide
Web…how could anyone put a definition on either of those in the early days?)
I’m not going to even attempt to define cloud computing. I’m simply treating it
as the general trend toward virtualization of computing resources. (Others have
done a good job of clarifying the space. See Peter Laird’s post on the cloud
market, or Reuven Cohen’s description.)
As of now, the cloud space can be divided into several broad market areas,
primarily:
- Software as a Service (SaaS)
- Platform as a Service (PaaS)
- Infrastructure as a Service (IaaS)
There is a lot of current activity in all of these areas among vendors and
adopters. However, cloud computing will not happen overnight - it will be
a gradual transition and, current hype aside, that transition is ongoing in all
of these areas.
In the meantime, there are still a lot of existing software applications that
will not move to the ‘cloud’ any time soon. Looking at the trends, this raises
a lot of issues about how we are going to integrate all of these into a
cohesive functional unit. As a result, I chaired a session at CloudCamp to
promote some discussion on this topic, and we had 25 or 30 people in the room.
The session didn’t go as smoothly as I hoped, since there was one dissenting
voice, who insisted the solution to all these problems was local storage on the
client side. That aside, we did cover some good topics I’m summarizing here,
and adding some of my other thoughts.
The integration problem
Integration is a very broad term that encompasses multiple levels, from low
level data interfaces to workflow and higher level business process automation.
Despite the various levels and interests in the room, there was general
consensus that integration is a real issue with cloud computing.
Some participants pointed out that we haven’t fully solved the integration problem
locally. Despite the vision of SOA, and the existence of messaging, there’s
actually a lot of integration done today with Duct Tape and Paper Clips (or,
in hard currency, Perl and PL/SQL). The larger enterprises have the
resources and skills to build their own integration capability, but smaller
businesses (the earliest adopters of SaaS) simply don’t have the infrastructure
in place. 65% of the enterprises surveyed by Forrester in late 2007 indicated
integration issues as their reason for not considering SaaS.
The reality is that the integration matrix is still too complex, with its
permutations of protocols, access methods, and data formats.
Ray Wang of Forrester also notes this integration barrier is ‘more fallacy than
fact’, although I don’t agree with that latter statement - cloud raises some
significant new integration challenges, and they shouldn’t be underestimated.
Why the cloud makes integration harder
If we have difficulty with local integration, the cloud wave only makes integration
more difficult. There are a number of reasons why:
- New integration scenarios
- Access to the cloud may be limited
- Dynamic resources
- Performance
New integration scenarios
Before the cloud model, we had to tie local systems together. With the shift
to a cloud model, we now have to connect local applications to the cloud, and
we also have to connect cloud applications to each other, which adds new
permutations to the matrix.
It’s unlikely that everything will move to a cloud model all at once, so even
the simplest scenarios require some form of local / remote integration. It’s
also likely that we will have applications that never leave the building, due
to regulatory constraints like HIPAA, GLBA, and general security and NPPI
issues. All of this means integration must cross a firewall somewhere.
Cloud to Cloud also raises issues. Do we rely on the (competing) vendors to
provide integration with each other? Where is the integration hosted? Does
the integration live in the cloud as well? And if so, how does it connect to
those local applications?
Access may be limited
Access to cloud resources, either SaaS, PaaS, or pure infrastructure, is more
limited than access to local applications.
With local applications, we usually had complete access to the application,
even when there were no good integration points in the original design. With
custom applications, adding integration hooks was possible. Even with
commercial applications, it was always possible to slip in database triggers to
raise events and provide hooks for integration access.
Once applications move to the cloud, custom applications must be designed to
support integration because we no longer have that low level of access. For
SaaS, we are dependent on the vendor to provide the integration hooks and
APIs. For example, the Salesforce.com Web Services API doesn’t support
transactions against multiple records, which means integration code has to
handle that logic. For PaaS, the platform might support integration for
applications on the platform. Platform to Platform is still an open question.
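As a rough illustration of what ‘integration code has to handle that logic’ can mean, the sketch below applies a set of record updates one at a time and undoes the ones already applied if a later one fails. Everything here is hypothetical: FakeClient and its get/update methods stand in for a SaaS client and are not the Salesforce.com API, and update_all_or_nothing is a made-up helper name.

```python
class FakeClient:
    """In-memory stand-in for a SaaS API client (hypothetical, for illustration)."""

    def __init__(self, records, fail_on=None):
        self.records = records
        self.fail_on = fail_on      # record id whose update should be rejected

    def get(self, record_id):
        return dict(self.records[record_id])

    def update(self, record_id, fields):
        if record_id == self.fail_on:
            raise RuntimeError("update rejected for %s" % record_id)
        self.records[record_id].update(fields)


def update_all_or_nothing(client, changes):
    """Apply changes (record id -> fields) in order; undo applied ones on failure."""
    applied = []                    # (record id, previous values), in order applied
    try:
        for record_id, fields in changes.items():
            before = client.get(record_id)
            client.update(record_id, fields)
            applied.append((record_id, before))
    except Exception:
        # Compensate in reverse order, then re-raise the original error.
        # This restores previous values of existing fields only.
        for record_id, before in reversed(applied):
            client.update(record_id, before)
        raise
```

This is compensation, not a real transaction: other clients can see the intermediate state, and the undo calls themselves can fail, which is exactly the kind of complexity that lands on the integration layer.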
Some would argue that a limited set of APIs will improve the situation, since
backdoor access is what led integration into the ‘Duct Tape’ mess in the first
place. They have a valid point. But those APIs must be able to handle the
integration required, and the popularity of BeautifulSoup tells me screen
scraping as a workaround is alive and well.
Dynamic Resources
The true cloud model abstracts away most of our notions of physical hardware,
and everything becomes a service in some form. This is one of the benefits of
cloud, but it also means integration models change. In a world where
applications and infrastructure move and change dynamically, traditional
notions of tightly coupled integration are no longer valid. Add to this the
issues of application versioning (no longer under our control in a SaaS
environment), or PaaS platform changes (also no longer under our control) and
tight coupling becomes a dead end.
It’s clear from the SaaS vendors that the Web is the way to go when it comes to
client access. It seems that lower level interfaces will follow the
same REST route.
Performance
Cloud or not, we still can’t get away from physical limitations. Although we
may see better application scalability and performance, the network distances
between elements in the cloud are no longer under our control. Bandwidth isn’t
the limiting factor in most integration scenarios; round trip latency is. On
the LAN, we can optimize. In the cloud, we lose that ability, and have to live
with longer latency in combination with SLAs from multiple vendors.
These performance constraints change the integration model. Integration can no
longer depend on high performance local access - anyone who does complex
analysis on, say, Salesforce leads, or anyone trying to pull SaaS data into
a warehouse will tell you performance is already an issue.
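A back-of-the-envelope sketch shows why round trip latency, not bandwidth, tends to dominate. The numbers are assumptions chosen for illustration (10,000 sequential lookups of 2 KB each over a 100 Mbit/s link, with a 0.5 ms round trip on the LAN and 80 ms to a remote service), and transfer_seconds is a made-up helper, not a measurement of any real system.

```python
def transfer_seconds(requests, bytes_per_request, rtt_seconds, bandwidth_bytes_per_sec):
    """Rough time for sequential request/response calls.

    Each call pays one round trip plus the time to move its payload.
    """
    latency_cost = requests * rtt_seconds
    payload_cost = requests * bytes_per_request / bandwidth_bytes_per_sec
    return latency_cost + payload_cost


# Assumed figures: same payload and bandwidth, only the round trip time differs.
LINK_BYTES_PER_SEC = 12_500_000          # 100 Mbit/s
lan = transfer_seconds(10_000, 2_000, 0.0005, LINK_BYTES_PER_SEC)
wan = transfer_seconds(10_000, 2_000, 0.080, LINK_BYTES_PER_SEC)

print("LAN: %.1f s, remote: %.1f s" % (lan, wan))
```

Under these assumptions the job takes a few seconds locally and more than thirteen minutes remotely, with identical bandwidth. Batching requests or moving the processing closer to the data changes the result far more than a faster link would.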
What this means for integration
There are a lot of implications as a result of the shift to a cloud model, and
I originally hoped we could get into the deeper issues during CloudCamp, but
time wasn’t on our side.
Cloud vendors understand that integration is an issue, particularly at the
application and platform level. We are increasingly seeing both
integration as a service, and integration as part of the platform.
The question is what form that integration capability will take. Will it be
open and standards based like OpenSocial? Or will it be vendor and platform specific?
There’s a lot at stake here,
because there’s a strong possibility that integration is becoming part of the
vendor lock-in strategy.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=202
What kind of service guarantees can I expect from PaaS?
Posted 14 June, 2008 by
Chris in
PaaS, web services
1 Comment
Peter Laird wrote up a great analysis on Terms of Service for SaaS providers and it got me wondering: What about the emerging Platform as a Service providers?
They’ve got an especially difficult challenge because they’re running other people’s code. How do you ensure availability when it’s not your code? Google’s App Engine conspicuously avoids the problem by not providing any SLA.
Generally speaking, there seem to be three categories of providers:
- Infrastructure, managed hosting and run time environments: OpSource, Etelos, Joyent, AWS, GAE, etc.
- Cloud IDEs: Bungee Labs, etc.
- App Builders: CogHead, LongJump, etc.
Obviously, the closer you get to bare metal, the harder it is for you to provide service guarantees.
I’ve already noted that AWS provides for 99.9% in their SLA, but that’s only for the service itself. Nowhere in that document does it say anything about your image’s uptime. Seems that you are on your own entirely.
This is a tricky problem even for the app builders. They all support some kind of scripting environment and you can only sandbox so much. Not to pick on CogHead, but I found this post on their forum. So, as you can see, these problems are going to show up everywhere.
I’d love to hear from anyone that has more details on SLA from platform providers.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=201
Hi-fiving failure and a lesson on how not to treat your users…
Posted 10 June, 2008 by
Chris in
Cloud
2 Comments
I saw today via Techmeme that Twitter was excited and proud that they were able to achieve 97.3% uptime during the Apple WWDC keynote yesterday.
If I were them, I’d be a little more humble and a little more circumspect. Reading through the comments, you’d think they just landed a man on the moon.
First, let me say that under many circumstances achieving 97.3% availability is grounds for termination. Most Enterprise SLAs specify 99.9% or more with service credits applied for failure. Amazon’s SLA provides for 99.9% with 10% credited back if they fall below that and 25% credit below 99%.
Salesforce.com had some serious trouble with availability a while back and people were legitimately wondering if they would survive the crisis. Today, they make a provision for these SLA failure expenses, and so far, have been lucky enough (i.e. smart enough) not to have to make any payments.
Just to put that in perspective, 99.9% uptime translates to about 44 minutes of downtime per month (99%, about 440). So, at 97.3% for the (roughly) four hours of peak time usage during the keynote, they were unavailable for about 6.5 minutes, or nearly 15% of the downtime budget for the month.
This isn’t something I would be proud of.
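The arithmetic behind those figures can be checked in a few lines. This sketch assumes a 30-day month, which gives a 99.9% budget of roughly 43 minutes (the ‘about 44’ above corresponds to a slightly longer month); downtime_minutes is a helper name made up for this example.

```python
def downtime_minutes(availability_pct, period_minutes):
    """Minutes of allowed (or observed) downtime for a given availability."""
    return period_minutes * (100.0 - availability_pct) / 100.0


MONTH_MINUTES = 30 * 24 * 60                            # assumption: 30-day month

budget_999 = downtime_minutes(99.9, MONTH_MINUTES)      # monthly budget at 99.9%
budget_99 = downtime_minutes(99.0, MONTH_MINUTES)       # monthly budget at 99%
keynote = downtime_minutes(97.3, 4 * 60)                # four-hour keynote window
share_of_budget = keynote / budget_999

print("99.9%% budget: %.1f min" % budget_999)
print("99%% budget: %.1f min" % budget_99)
print("keynote downtime: %.2f min (%.0f%% of the 99.9%% budget)"
      % (keynote, 100 * share_of_budget))
```

In other words, a four-hour window at 97.3% uses up roughly a seventh of what a 99.9% SLA allows for an entire month.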
Second, your users are not your QA or test engineering department. They claim:
…we learned a lot during this stress test and that will translate to better performance down the line.
Finally, turning off features to support peak loads is treating the symptoms, not the underlying problem.
Is it any wonder their site is as unreliable as it is? With this kind of attitude, I don’t think things are going to materially change anytime soon.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=200
Integration remains the number one inhibitor to adopting SaaS…
Posted 9 June, 2008 by
Chris in
SnapLogic, SOA, Open Source
No Comments
CIO magazine reports that a recent survey by Forrester indicates that integration remains the number one inhibitor for Enterprises adopting SaaS.
This comes as no surprise to me.
We’ve been talking to all the large hosted application providers and they all say the same thing: My subscribers have data inside their organization and they need to get it out. The larger the company, the more likely they need to integrate their SaaS application with behind the firewall data and other applications. When integrating that data becomes more difficult (or less secure, or available) the extra integration complexity is traded off against the simplicity of a SaaS alternative. Once the complexity gets too large, it can quickly overcome the benefits of SaaS.
Integrating SaaS with internal apps remains problematic since access generally requires software that runs inside the organization to manipulate and transform the data and to initiate integration with the SaaS apps for higher security.
There are a number of hosted integration alternatives emerging, but few adequately address the behind the firewall issue. The ones that do suffer from either being too complex for the SaaS marketplace, or too limited in function to handle the task.
We here at SnapLogic have been working on the problem for a while now. We think we’ve got the right approach to this problem. We’ve got open source software that you can download and run wherever the problem lives. Our distributed approach allows you to apply the necessary access and/or transformation functions wherever they’re needed, and deploy it on top of your existing infrastructure, whether that is in your data center, in the DMZ, or in a co-location facility. And since it works just like the web, you already know how to keep it running fast and secure.
Of all the possible deployment scenarios for SnapLogic, integrating SaaS applications with enterprise apps is by far the most common.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=199
Posted 11 July, 2008 by Chris in ESB, web services
1 Comment
Who needs it? Seriously.
I just read Jim Stogdill’s post on O’Reilly Radar that describes the new data portability hub Gnip as ‘An ESB for the Web’.
It’s something that occurred to me as well when I first heard about it, but I quickly realized that, while it might be useful for some Data Portability transformations and social networking site integrations, it’s a long way from being an ESB for the web, or even useful outside of the social networking domain.

For starters, their protocol bridge doesn’t even mention SOAP (I’m not a big fan of SOAP, but let’s be realistic. Every transactional application on the web today has a SOAP interface). From there, there are about 100 other things that make this seriously deficient as an ESB for the Web.
Which is perfectly fine for what they’re doing. They are working toward ‘Making Data Portability Suck Less’, which is something I really do think they can achieve with this bridge. But Jim, let’s not get carried away and ascribe to them capabilities (not to mention a strategy) that they don’t seem interested in.
One thing that Jim points out that is dead-on is that web people and enterprise people think differently. Which explains why there’s no SOAP on the Gnip bridge, and why enterprise people still think they even need an ESB for the web. I could go on and dig into all the details, but it’s presented more clearly here than I ever could.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=206
SnapLogic for EC2 now available…
Posted 8 July, 2008 by
Chris in
PaaS, Cloud
No Comments
I’m happy to announce that we just released an updated version of SnapLogic for EC2.
Just like that, we’re a Cloud Computing Company. Seems like everyone, everywhere is a Cloud Computing Company. Heck, even The Onion knows this.
But seriously, this is a natural step for us. From the ground up, SnapLogic was built with the web in mind so making it available for use on Amazon’s Web Services (AWS) was a natural choice. AWS users are forward thinking and want to use the power of the web for everything they do. SnapLogic is attracting these kinds of users because everything they’re doing to build scalable, reliable, secure web applications applies equally to their integrations with SnapLogic. Our re-usable RESTful components, web-friendly protocols and browser-based IDE fit perfectly with their PaaS and SaaS strategies.
So just as they look to hosted alternatives for applications, they’re increasingly looking for hosted alternatives for their platforms and infrastructure. SnapLogic for AWS is a perfect fit.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=205
EuroPython 2008 Day 1
Posted 8 July, 2008 by
mikeyp in
conference, Python, community
No Comments
I’m Vlinius at EuroPython 2008 this week. There are about 250 attendees,
which is somewhat bigger that last year.
This is already shaping up to be a good conference. The overall energy level
here is high. Everyone is enthusiastic about Python and the projects they work
on.
This is definitely a Python community event, and it has the comfortable
feeling of a unConference, while still being well organized. It’s a good
balance.
The overall feel is similar to PyCon, yet at the same time it’s subtly
different in a way that I can’t fully explain yet. It’s not just a
culture or USA / Europe thing - there’s something more to the difference.
So far, I have noticed more of a tendency to continually improve and
challenge the status quo. There is a strong spirit of innovation
and there have been many presentations about ideas and work in progress.
I fought off Jet Lag, and caught a full days of sessions for Monday.
So far, I attended:
- Marc-Andre Lemburg’s session on the Python database API
- Dinu Gherman’s talk on his work with Paragraphs in ReportLab
- Tommi Virtanen’s talk on Filesystem like API’s
- Ignas Mikalajunas’s talk on the benefits of eggs
- Christian Scholz’s talk on data portability
I had a 4pm talk on using SnapLogic to analyze Trac tickets, which was well
attended. There’s a lot of interest in what we’re doing, and plenty of
questions about what the practical applications are. The Trac integration
project I’m working on should help in that area.
Slides from my talk (and others) are on the EuroPython wiki talks page,
and also on the SnapLogic wiki
The timezones eventually caught up with me, and I missed Guido Von Rossum’s keynote at the end of the day.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=203
Connecting Clouds - Integration and Cloud Computing
Posted 27 June, 2008 by
mikeyp in
Cloud, SOA, ESB, EAI, ETL
1 Comment
I went to the excellent CloudCamp gathering in San Francisco Tuesday night.
CloudCamp was organized by Reuven Cohen, Jesse Silver, Dave Nielsen, and others, and there were
about 350 people there (SnapLogic also sponsored). The timing was good,
since it overlapped with a number of other conferences, which attracted a lot
of folks from out of town. Overall, I declare this a success, and I’m looking
forward to future CloudCamp events.
Cloud Computing is a broad concept, and that was reinforced by the nunber of
talks and discussions that seemed to always converge on the topic of ‘What is
cloud computing ?’
With an evolving concept like cloud computing, it’s hard to really nail down a
definition of what it is, since there really isn’t an it yet. (It reminds
me of the early days of the Internet, and the early days of the World Wide
Web…how could anyone put a definition on either of those in the early days?)
I’m not going to even attempt to define cloud computing. I’m simply treating it
as the general trend toward virtualization of computing resources. (Others have
done a good job of clarifying the space. See Peter Laird’s post on the cloud
market, or Reuven Cohen’s description.)
As of now, the cloud space can be divided into several broad market areas,
primarily:
- Software as a Service (Saas)
- Platform as a Services (PaaS)
- Infrastructure as a Service (IaaS)
There is a lot of current activity in all of these areas among vendors and
adopters. However, cloud computing will not happen overnight - it will be
a gradual transition and, current hype aside, that transition is ongoing in all
of these areas.
In the meantime, there are still a lot of existing software applications that
will not move to the ‘cloud’ any time soon. Looking at the trends, this raises
a lot of issues about how we are going to integrate all of these into a
cohesive functional unit. As a result, I chaired a session at CloudCamp to
promote some discussion on this topic, and we had 25 or 30 people in the room.
The session didn’t go as smoothly as I hoped, since there was one dissenting
voice, who insisted the solution to all these problems was local storage on the
client side. That aside, we did cover some good topics I’m summarizing here,
and adding some of my other thoughts.
The integration problem
Integration is a very broad term, that encompasses multiple levels, from low
level data interfaces to workflow and higher level business process automation.
Despite the various levels and interests in the room, there was general
consensus that integration is a real issue with cloud computing.
Some participants pointed that we haven’t fully solved the integration problem
locally. Despite the vision of SOA, and the existence of messaging, there’s
actually a lot of integration done today with Duct Tape and Paper Clips (or,
in hard currency, Perl and PL/SQL.) The larger enterprises have the
resources and skills to build their own integration capability, but smaller
business (the earliest adopters of SaaS), simply don’t have the infrastructure
in place. 65% of the enterprises surveyed by Forrested in late 2007 indicated
integration issues as their reason for not considering SaaS.
The reality is that integration matrix is still too complex, with it’s
permutation of protocols, access methods, and data formats.
Ray Wang of Forrester also notes this integration barrier is ‘more fallacy than
fact’, although I don’t agree with that latter statement - cloud raises some
significant new integration challenges, and they shouldn’t be underestimated.
Why the cloud makes integration harder
If we have difficulty with local integration, the cloud wave only makes integration
more difficult. There are a number of reasons why:
- New integration scenarios
- Access to the cloud may be limited
- Dynamic resources
- Performance
New integration scenarios
Before the cloud model, we had to tie local systems together. With the shift
to a cloud model, we now have to connect local applications to the cloud, and
we also have to connect cloud applications to each other, which adds new
permutations to the matrix.
Its unlikely that everything will move to a cloud model all at once, so even
the simplest scenarios require some form of local / remote integration. It’s
also likely that we will have applications that never leave the building, due
to regulatory constraints like HIPPA, GLBA, and general security and NPPI
issues. All of this means integration must cross a firewall somwehere.
Cloud to Cloud also raises issues. Do we rely on the (competing) vendors to
provide integration with each other ? Where is the integration hosted ? Does
the integration live in the cloud as well? And if so, how does it connect to
those local applications ?
Access may be limited
Access to cloud resources, either SaaS, PaaS, or pure infrastructure, is more
limited than local applications.
With local applications, we usually had complete access to the application,
even when there were no good integration points in the original design. With
custom applications, adding integration hooks was possible. Even with
commercial applications, it was always possible to slip in database triggers to
raise events and provide hooks for integration access.
Once applications move to the cloud, custom applications must be designed to
support integration because we no longer have that low level of access. For
SaaS, we are dependent on the vendor to provide the integration hooks and
API’s. For example, the SalesForce.com Web Services API doesn’t support
transactions against multiple records, which means integration code has to
handle that logic. For PaaS, the platform might support integration for
aplications on the platform. Platform to Platform is still an open question.
Some would argue that a limited set of APIs will improve the situation, since
backdoor access is what led integration into the ‘Duct Tape’ mess in the first
place. They have a valid point. But those API’s must be able to handle the
integration required, and the popularity of BeautifulSoup tells me screen
scraping as a workaround is alive and well.
Dynamic Resources
The true cloud model abstracts away most of our notions of physical hardware,
and everything becomes a service in some form. This is one of the benefits of
cloud, but it also means integration models change. In a world where
applications and infrastructure move and change dynamically, traditional
notions of tightly coupled integration are no longer valid. Add to this the
issues of application versioning (no longer under our control in a SaaS
environment), or PaaS platform changes (also no longer under our control) and
tight coupling becomes a dead end.
It clear from the SaaS vendors that the Web is the way to go when it comes to
client access. It seems that lower level interfaces will follow the
same REST route.
Performance
Cloud or not, we still can’t get away from physical limitations. Although we
may see better application scalability and performance, the network distances
between elements in the cloud are no longer under our control. Bandwidth isn’t
the limiting factor in most integration scenarios, round trip latency is. On
the LAN, we can optimize. In the cloud, we lose that ability, and have to live
with longer latency in combination with SLA’s from multiple vendors.
These performance constraints change the integration model. Integration can no
longer depend on high performance local access - anyone that does complex
analysis on, say, SalesForce leads, or anyone trying to pull SaaS data into
a warehouse will tell you performance is already an issue.
What this means for integration
There are a lot of implications as a result of the shift to a cloud model, and
I originally hoped we could get into the deeper issues during CloudCamp, but
time wasn’t on our side.
Cloud vendors understand that integration is an issue, particularly at the
application and platform level. We are increasingly seeing both
integration as a service, and integration as part of the platform.
The question is what form will that integration capability take ? Will it be
open and standards based like Open Social? Or will it be vendor and platform specific ?
Theres a lot at stake here,
because there a strong possibility that integration is becoming part of the
vendor lock-in strategy.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=202
What kind of service guarantees can I expect from PaaS?
Posted 14 June, 2008 by
Chris in
PaaS, web services
1 Comment
Peter Laird wrote up a great analysis on Terms of Service for SaaS providers and it got me wondering: What about the emerging Platform as a Service providers?
They’ve got an especially difficult challenge because they’re running other people’s code. How do you ensure availability when it’s not your code? Google’s App Engine conspicuously avoids the problem by not providing any SLA.
Generally speaking, there seems be three categories of providers:
Infrastructure, managed hosting and run time environments: OpSource, Etelos, Joyent, AWS, GAE, etc
Cloud IDEs: Bungee Labs, etc.
App Builders: CogHead, LongJump, etc.
Obviously, the closer you get to bare metal, the harder it is for you to provide service guarantees.
I’ve already noted that AWS provides for 99.9% in their SLA, but that’s only for the service itself. Nowhere in that document does it say anything about your image’s uptime. Seems that you are on your own entirely.
This is a tricky problem even for the app builders. They all support some kind of scripting environment and you can only sandbox so much. Not to pick on CogHead, but I found this post on their forum. So, as you can see, these problems are going to show up everywhere.
I’d love to hear from anyone who has more details on SLAs from platform providers.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=201
High-fiving failure and a lesson on how not to treat your users…
Posted 10 June, 2008 by
Chris in
Cloud
2 Comments
I saw today via Techmeme that Twitter was excited and proud that they were able to achieve 97.3% uptime during the Apple WWDC keynote yesterday.
If I were them, I’d be a little more humble and a little more circumspect. Reading through the comments, you’d think they had just landed a man on the moon.
First, let me say that under many circumstances achieving 97.3% availability is grounds for termination. Most Enterprise SLAs specify 99.9% or more with service credits applied for failure. Amazon’s SLA provides for 99.9% with 10% credited back if they fall below that and 25% credit below 99%.
Salesforce.com had some serious trouble with availability a while back and people were legitimately wondering if they would survive the crisis. Today, they make a provision for these SLA failure expenses, and so far, have been lucky enough (i.e. smart enough) not to have to make any payments.
Just to put that in perspective, 99.9% uptime translates to about 44 minutes of downtime per month (99%, about 440). So, at 97.3% for the (roughly) four hours of peak-time usage during the keynote, they were unavailable for about 6.5 minutes, or nearly 15% of the downtime budget for the month.
This isn’t something I would be proud of.
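The arithmetic above is easy to check. The sketch below assumes an average month of 30.44 days and the roughly four-hour keynote window; the function and variable names are mine, for illustration only.

```python
# Downtime budget arithmetic. The 30.44-day month is an assumption.
MINUTES_PER_MONTH = 30.44 * 24 * 60  # about 43,834 minutes


def downtime_minutes(availability_pct, window_minutes):
    """Minutes of downtime implied by an availability percentage over a window."""
    return window_minutes * (100.0 - availability_pct) / 100.0


monthly_budget = downtime_minutes(99.9, MINUTES_PER_MONTH)  # three nines, per month
two_nines_budget = downtime_minutes(99.0, MINUTES_PER_MONTH)
keynote_outage = downtime_minutes(97.3, 4 * 60)             # four-hour peak window
share_of_budget = keynote_outage / monthly_budget

print(f"99.9% monthly budget: {monthly_budget:.1f} min")
print(f"99% monthly budget:   {two_nines_budget:.0f} min")
print(f"keynote outage:       {keynote_outage:.1f} min")
print(f"share of budget:      {share_of_budget:.0%}")
```

In other words, one four-hour event used up roughly a seventh of what a 99.9% SLA allows for the entire month.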
Second, your users are not your QA or test engineering department. They claim:
…we learned a lot during this stress test and that will translate to better performance down the line.
Finally, turning off features to support peak loads is treating the symptoms, not the underlying problem.
Is it any wonder their site is as unreliable as it is? With this kind of attitude, I don’t think things are going to materially change anytime soon.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=200
Integration remains the number one inhibitor to adopting SaaS…
Posted 9 June, 2008 by
Chris in
SnapLogic, SOA, Open Source
No Comments
CIO magazine reports that a recent survey by Forrester indicates that integration remains the number one inhibitor for enterprises adopting SaaS.
This comes as no surprise to me.
We’ve been talking to all the large hosted application providers and they all say the same thing: “My subscribers have data inside their organization and they need to get it out.” The larger the company, the more likely it needs to integrate its SaaS application with behind-the-firewall data and other applications. When integrating that data becomes more difficult (or less secure, or less available), the extra integration complexity is traded off against the simplicity of a SaaS alternative. Once the complexity gets too large, it can quickly outweigh the benefits of SaaS.
Integrating SaaS with internal apps remains problematic, since access generally requires software that runs inside the organization to manipulate and transform the data and, for higher security, to initiate the integration with the SaaS apps.
There are a number of hosted integration alternatives emerging, but few adequately address the behind-the-firewall issue. The ones that do suffer from being either too complex for the SaaS marketplace or too limited in function to handle the task.
We here at SnapLogic have been working on the problem for a while now, and we think we’ve got the right approach. We’ve got open source software that you can download and run wherever the problem lives. Our distributed approach allows you to apply the necessary access and/or transformation functions wherever they’re needed, and deploy it on top of your existing infrastructure, whether that’s in your data center, in the DMZ, or in a co-location facility. And since it works just like the web, you already know how to keep it running fast and secure.
Of all the possible deployment scenarios for SnapLogic, integrating SaaS applications with enterprise apps is by far the most common.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=199
Posted 8 July, 2008 by Chris in PaaS, Cloud
No Comments
I’m happy to announce that we just released an updated version of SnapLogic for EC2.
Just like that, we’re a Cloud Computing Company. Seems like everyone, everywhere is a Cloud Computing Company. Heck, even The Onion knows this.
But seriously, this is a natural step for us. From the ground up, SnapLogic was built with the web in mind, so making it available for use on Amazon Web Services (AWS) was a natural choice. AWS users are forward-thinking and want to use the power of the web for everything they do. SnapLogic is attracting these kinds of users because everything they’re doing to build scalable, reliable, secure web applications applies equally to their integrations with SnapLogic. Our reusable RESTful components, web-friendly protocols and browser-based IDE fit perfectly with their PaaS and SaaS strategies.
So just as they look to hosted alternatives for applications, they’re increasingly looking for hosted alternatives for their platforms and infrastructure. SnapLogic for AWS is a perfect fit.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=205
EuroPython 2008, Day 2
Posted 8 July, 2008 by
mikeyp in
conference, Python, community
No Comments
Day 2 of EuroPython 2008 has been just as good as day 1.
Today, I attended:
- Dinu Gherman’s talk on Visualising Relationships Between Python Objects.
  This was a description of a prototype he’s working on for visualization of
  Python object relationships. Based on GraphViz, it appears promising, and it
  could have interesting applications for documenting and understanding code.
- Peter Bulychev’s talk on Duplicate code detection using his Clone Digger.
  Another interesting talk on a system for detecting duplicate code in Python
  programs.
- John Pinner’s talk on Conferencing systems in Python.
  A good discussion on the support system for the EuroPython and PyCon UK
  conferences. Conference organizing systems seem to be an endless source of
  challenges for developers.
- Raymond Hettinger’s Descriptor tutorial.
  A very good talk on the internals of Python descriptors.
  Now, all your dots are belong to us.
- Stefan Behnel’s introduction to the Cython compiler.
  Python to C, lots of uses for performance optimization and external library
  interfacing.
On the unconference side of the conference, there was a good follow-up to
Monday’s talk on filesystem-like APIs. Tommi Virtanen is leading this effort
to brainstorm some ideas about developing a cleaner file and filesystem
interface. The general idea is to collect some of the best ideas in path.py,
twisted.filepath, and various other libraries, and see if it’s possible to come up
with a new interface that would consolidate the best and cleanest
functionality, which currently needs multiple modules from multiple locations.
The next step will be a sprint later this week. There are some notes on the
wiki, but they’re probably not too useful unless you were in the session.
The lightning talks for Tuesday have been really good. I think
they were better than the US PyCon lightning talks, and
EuroPython uses Swiss timing to boot!
The day wrapped up with an excellent and inspiring keynote by Hans Rosling
about the work he has done on Gapminder.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=204
EuroPython 2008 Day 1
Posted 8 July, 2008 by
mikeyp in
conference, Python, community
No Comments
I’m in Vilnius at EuroPython 2008 this week. There are about 250 attendees,
which is somewhat bigger than last year.
This is already shaping up to be a good conference. The overall energy level
here is high. Everyone is enthusiastic about Python and the projects they work
on.
This is definitely a Python community event, and it has the comfortable
feeling of an unconference, while still being well organized. It’s a good
balance.
The overall feel is similar to PyCon, yet at the same time it’s subtly
different in a way that I can’t fully explain yet. It’s not just a
culture or USA / Europe thing - there’s something more to the difference.
So far, I have noticed more of a tendency to continually improve and
challenge the status quo. There is a strong spirit of innovation
and there have been many presentations about ideas and work in progress.
I fought off jet lag, and caught a full day of sessions on Monday.
So far, I attended:
- Marc-Andre Lemburg’s session on the Python database API
- Dinu Gherman’s talk on his work with Paragraphs in ReportLab
- Tommi Virtanen’s talk on filesystem-like APIs
- Ignas Mikalajunas’s talk on the benefits of eggs
- Christian Scholz’s talk on data portability
I had a 4pm talk on using SnapLogic to analyze Trac tickets, which was well
attended. There’s a lot of interest in what we’re doing, and plenty of
questions about what the practical applications are. The Trac integration
project I’m working on should help in that area.
Slides from my talk (and others) are on the EuroPython wiki talks page,
and also on the SnapLogic wiki.
The time zones eventually caught up with me, and I missed Guido van Rossum’s keynote at the end of the day.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=203
Connecting Clouds - Integration and Cloud Computing
Posted 27 June, 2008 by
mikeyp in
Cloud, SOA, ESB, EAI, ETL
1 Comment
I went to the excellent CloudCamp gathering in San Francisco Tuesday night.
CloudCamp was organized by Reuven Cohen, Jesse Silver, Dave Nielsen, and others, and there were
about 350 people there (SnapLogic also sponsored). The timing was good,
since it overlapped with a number of other conferences, which attracted a lot
of folks from out of town. Overall, I declare this a success, and I’m looking
forward to future CloudCamp events.
Cloud Computing is a broad concept, and that was reinforced by the number of
talks and discussions that seemed to always converge on the topic of ‘What is
cloud computing?’
With an evolving concept like cloud computing, it’s hard to really nail down a
definition of what it is, since there really isn’t an it yet. (It reminds
me of the early days of the Internet, and the early days of the World Wide
Web…how could anyone put a definition on either of those in the early days?)
I’m not going to even attempt to define cloud computing. I’m simply treating it
as the general trend toward virtualization of computing resources. (Others have
done a good job of clarifying the space. See Peter Laird’s post on the cloud
market, or Reuven Cohen’s description.)
As of now, the cloud space can be divided into several broad market areas,
primarily:
- Software as a Service (SaaS)
- Platform as a Service (PaaS)
- Infrastructure as a Service (IaaS)
There is a lot of current activity in all of these areas among vendors and
adopters. However, cloud computing will not happen overnight - it will be
a gradual transition and, current hype aside, that transition is ongoing in all
of these areas.
In the meantime, there are still a lot of existing software applications that
will not move to the ‘cloud’ any time soon. Looking at the trends, this raises
a lot of issues about how we are going to integrate all of these into a
cohesive functional unit. As a result, I chaired a session at CloudCamp to
promote some discussion on this topic, and we had 25 or 30 people in the room.
The session didn’t go as smoothly as I hoped, since there was one dissenting
voice, who insisted the solution to all these problems was local storage on the
client side. That aside, we did cover some good topics I’m summarizing here,
and adding some of my other thoughts.
The integration problem
Integration is a very broad term that encompasses multiple levels, from
low-level data interfaces to workflow and higher-level business process automation.
Despite the various levels and interests in the room, there was general
consensus that integration is a real issue with cloud computing.
Some participants pointed out that we haven’t fully solved the integration problem
locally. Despite the vision of SOA, and the existence of messaging, there’s
actually a lot of integration done today with Duct Tape and Paper Clips (or,
in hard currency, Perl and PL/SQL). The larger enterprises have the
resources and skills to build their own integration capability, but smaller
businesses (the earliest adopters of SaaS) simply don’t have the infrastructure
in place. 65% of the enterprises surveyed by Forrester in late 2007 indicated
integration issues as their reason for not considering SaaS.
The reality is that the integration matrix is still too complex, with its
permutations of protocols, access methods, and data formats.
Ray Wang of Forrester also notes this integration barrier is ‘more fallacy than
fact’, although I don’t agree with that latter statement - cloud raises some
significant new integration challenges, and they shouldn’t be underestimated.
Why the cloud makes integration harder
If we have difficulty with local integration, the cloud wave only makes integration
more difficult. There are a number of reasons why:
- New integration scenarios
- Access to the cloud may be limited
- Dynamic resources
- Performance
New integration scenarios
Before the cloud model, we had to tie local systems together. With the shift
to a cloud model, we now have to connect local applications to the cloud, and
we also have to connect cloud applications to each other, which adds new
permutations to the matrix.
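To make the growth in permutations concrete, here is a small sketch that counts the pairwise links in a made-up inventory of four systems. The system names and where they live are assumptions for illustration; only the local / cloud split comes from the discussion above.

```python
from itertools import combinations

# Hypothetical inventory: names and locations are made up for illustration.
systems = {
    "erp": "local",
    "warehouse": "local",
    "crm": "cloud",
    "billing": "cloud",
}


def count_links(inventory):
    """Count pairwise integration links, grouped by where the two ends live."""
    counts = {}
    for a, b in combinations(sorted(inventory), 2):
        kind = "-".join(sorted((inventory[a], inventory[b])))
        counts[kind] = counts.get(kind, 0) + 1
    return counts


link_counts = count_links(systems)
print(link_counts)
```

With two local and two cloud systems, only one of the six possible links stays entirely inside the building. Four of them cross the firewall, and one runs between cloud vendors.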
It’s unlikely that everything will move to a cloud model all at once, so even
the simplest scenarios require some form of local / remote integration. It’s
also likely that we will have applications that never leave the building, due
to regulatory constraints like HIPAA and GLBA, and general security and NPPI
issues. All of this means integration must cross a firewall somewhere.
Cloud to Cloud also raises issues. Do we rely on the (competing) vendors to
provide integration with each other? Where is the integration hosted? Does
the integration live in the cloud as well? And if so, how does it connect to
those local applications?
Access may be limited
Access to cloud resources, whether SaaS, PaaS, or pure infrastructure, is more
limited than access to local applications.
With local applications, we usually had complete access to the application,
even when there were no good integration points in the original design. With
custom applications, adding integration hooks was possible. Even with
commercial applications, it was always possible to slip in database triggers to
raise events and provide hooks for integration access.
Once applications move to the cloud, custom applications must be designed to
support integration because we no longer have that low level of access. For
SaaS, we are dependent on the vendor to provide the integration hooks and
APIs. For example, the Salesforce.com Web Services API doesn’t support
transactions against multiple records, which means integration code has to
handle that logic. For PaaS, the platform might support integration for
applications on the platform. Platform to Platform is still an open question.
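Here is a rough sketch of what "integration code has to handle that logic" can look like when an API only updates one record per call. FakeRecordApi and update_all_or_nothing are hypothetical names invented for this example. This is not the Salesforce.com API, just a stand-in with the same limitation.

```python
class FakeRecordApi:
    """Stand-in for a SaaS API that can only update one record per call."""

    def __init__(self, records):
        self.records = dict(records)

    def get(self, record_id):
        return self.records[record_id]

    def update(self, record_id, value):
        if value is None:
            raise ValueError(f"rejected update for {record_id}")
        self.records[record_id] = value


def update_all_or_nothing(api, changes):
    """Apply every change, or restore the old values if any single call fails."""
    applied = []  # (record_id, previous_value) for each update that succeeded
    try:
        for record_id, new_value in changes.items():
            previous = api.get(record_id)
            api.update(record_id, new_value)
            applied.append((record_id, previous))
    except Exception:
        # Compensate in reverse order; this is best effort, not a real transaction.
        for record_id, previous in reversed(applied):
            api.update(record_id, previous)
        return False
    return True


api = FakeRecordApi({"lead-1": "new", "lead-2": "new"})
ok = update_all_or_nothing(api, {"lead-1": "contacted", "lead-2": "contacted"})
failed = update_all_or_nothing(api, {"lead-1": "closed", "lead-2": None})
print(ok, failed, api.records)
```

Note what this does not give you: other clients can see the half-applied state before the rollback, and the compensating calls can fail too. That is exactly the kind of logic a database transaction used to handle for us.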
Some would argue that a limited set of APIs will improve the situation, since
backdoor access is what led integration into the ‘Duct Tape’ mess in the first
place. They have a valid point. But those APIs must be able to handle the
integration required, and the popularity of BeautifulSoup tells me screen
scraping as a workaround is alive and well.
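For anyone who hasn’t had to do it, screen scraping looks roughly like the sketch below. It uses Python’s standard-library html.parser rather than BeautifulSoup so that it runs without extra packages. The HTML fragment, the AmountScraper class and the "amount" class name are all invented for the example.

```python
from html.parser import HTMLParser

# Invented page fragment; a real SaaS page would be fetched over HTTP.
PAGE = """
<table>
  <tr><td class="name">Acme</td><td class="amount">1200</td></tr>
  <tr><td class="name">Globex</td><td class="amount">850</td></tr>
</table>
"""


class AmountScraper(HTMLParser):
    """Collect the text of every <td class="amount"> cell."""

    def __init__(self):
        super().__init__()
        self.in_amount = False
        self.amounts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "td" and ("class", "amount") in attrs:
            self.in_amount = True

    def handle_data(self, data):
        if self.in_amount:
            self.amounts.append(int(data))

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_amount = False


scraper = AmountScraper()
scraper.feed(PAGE)
print(scraper.amounts)
```

It works until the vendor renames a CSS class or rearranges the table, which is why scraping is a workaround and not an integration strategy.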
Dynamic Resources
The true cloud model abstracts away most of our notions of physical hardware,
and everything becomes a service in some form. This is one of the benefits of
cloud, but it also means integration models change. In a world where
applications and infrastructure move and change dynamically, traditional
notions of tightly coupled integration are no longer valid. Add to this the
issues of application versioning (no longer under our control in a SaaS
environment), or PaaS platform changes (also no longer under our control) and
tight coupling becomes a dead end.
It’s clear from the SaaS vendors that the Web is the way to go when it comes to
client access. It seems that lower-level interfaces will follow the
same REST route.
Performance
Cloud or not, we still can’t get away from physical limitations. Although we
may see better application scalability and performance, the network distances
between elements in the cloud are no longer under our control. Bandwidth isn’t
the limiting factor in most integration scenarios; round-trip latency is. On
the LAN, we can optimize. In the cloud, we lose that ability, and have to live
with longer latency in combination with SLAs from multiple vendors.
These performance constraints change the integration model. Integration can no
longer depend on high-performance local access - anyone who does complex
analysis on, say, Salesforce leads, or anyone trying to pull SaaS data into
a warehouse will tell you performance is already an issue.
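A back-of-the-envelope calculation shows why round trips matter more than bandwidth. Every number in the sketch below is an illustrative assumption (record count, a 1 ms LAN round trip, a 100 ms round trip to a remote service, a batch size of 200), not a measurement of any particular vendor.

```python
import math

# All figures below are illustrative assumptions, not measurements.
RECORDS = 10_000
LAN_RTT_S = 0.001     # about 1 ms round trip on a local network
CLOUD_RTT_S = 0.100   # about 100 ms round trip to a remote service


def transfer_seconds(records, rtt_s, batch_size=1):
    """Time spent waiting on round trips when records are sent in batches."""
    round_trips = math.ceil(records / batch_size)
    return round_trips * rtt_s


lan_time = transfer_seconds(RECORDS, LAN_RTT_S)
cloud_time = transfer_seconds(RECORDS, CLOUD_RTT_S)
cloud_batched = transfer_seconds(RECORDS, CLOUD_RTT_S, batch_size=200)
print(lan_time, cloud_time, cloud_batched)
```

Under these assumptions, one record per call takes about 10 seconds on the LAN and almost 17 minutes against the remote service, with no change in bandwidth at all. Batching 200 records per call brings the remote case down to about 5 seconds, but only if the vendor’s API lets you batch.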
What this means for integration
There are a lot of implications as a result of the shift to a cloud model, and
I originally hoped we could get into the deeper issues during CloudCamp, but
time wasn’t on our side.
Cloud vendors understand that integration is an issue, particularly at the
application and platform level. We are increasingly seeing both
integration as a service, and integration as part of the platform.
The question is what form that integration capability will take. Will it be
open and standards-based, like OpenSocial? Or will it be vendor- and platform-specific?
There’s a lot at stake here,
because there’s a strong possibility that integration is becoming part of the
vendor lock-in strategy.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=202
Connecting Clouds - Integration and Cloud Computing
Posted 27 June, 2008 by
mikeyp in
Cloud, SOA, ESB, EAI, ETL
1 Comment
I went to the excellent CloudCamp gathering in San Francisco Tuesday night.
CloudCamp was organized by Reuven Cohen, Jesse Silver, Dave Nielsen, and others, and there were
about 350 people there (SnapLogic also sponsored). The timing was good,
since it overlapped with a number of other conferences, which attracted a lot
of folks from out of town. Overall, I declare this a success, and I’m looking
forward to future CloudCamp events.
Cloud Computing is a broad concept, and that was reinforced by the nunber of
talks and discussions that seemed to always converge on the topic of ‘What is
cloud computing ?’
With an evolving concept like cloud computing, it’s hard to really nail down a
definition of what it is, since there really isn’t an it yet. (It reminds
me of the early days of the Internet, and the early days of the World Wide
Web…how could anyone put a definition on either of those in the early days?)
I’m not going to even attempt to define cloud computing. I’m simply treating it
as the general trend toward virtualization of computing resources. (Others have
done a good job of clarifying the space. See Peter Laird’s post on the cloud
market, or Reuven Cohen’s description.)
As of now, the cloud space can be divided into several broad market areas,
primarily:
- Software as a Service (Saas)
- Platform as a Services (PaaS)
- Infrastructure as a Service (IaaS)
There is a lot of current activity in all of these areas among vendors and
adopters. However, cloud computing will not happen overnight - it will be
a gradual transition and, current hype aside, that transition is ongoing in all
of these areas.
In the meantime, there are still a lot of existing software applications that
will not move to the ‘cloud’ any time soon. Looking at the trends, this raises
a lot of issues about how we are going to integrate all of these into a
cohesive functional unit. As a result, I chaired a session at CloudCamp to
promote some discussion on this topic, and we had 25 or 30 people in the room.
The session didn’t go as smoothly as I hoped, since there was one dissenting
voice, who insisted the solution to all these problems was local storage on the
client side. That aside, we did cover some good topics I’m summarizing here,
and adding some of my other thoughts.
The integration problem
Integration is a very broad term, that encompasses multiple levels, from low
level data interfaces to workflow and higher level business process automation.
Despite the various levels and interests in the room, there was general
consensus that integration is a real issue with cloud computing.
Some participants pointed that we haven’t fully solved the integration problem
locally. Despite the vision of SOA, and the existence of messaging, there’s
actually a lot of integration done today with Duct Tape and Paper Clips (or,
in hard currency, Perl and PL/SQL.) The larger enterprises have the
resources and skills to build their own integration capability, but smaller
business (the earliest adopters of SaaS), simply don’t have the infrastructure
in place. 65% of the enterprises surveyed by Forrested in late 2007 indicated
integration issues as their reason for not considering SaaS.
The reality is that integration matrix is still too complex, with it’s
permutation of protocols, access methods, and data formats.
Ray Wang of Forrester also notes this integration barrier is ‘more fallacy than
fact’, although I don’t agree with that latter statement - cloud raises some
significant new integration challenges, and they shouldn’t be underestimated.
Why the cloud makes integration harder
If we have difficulty with local integration, the cloud wave only makes integration
more difficult. There are a number of reasons why:
- New integration scenarios
- Access to the cloud may be limited
- Dynamic resources
- Performance
New integration scenarios
Before the cloud model, we had to tie local systems together. With the shift
to a cloud model, we now have to connect local applications to the cloud, and
we also have to connect cloud applications to each other, which adds new
permutations to the matrix.
Its unlikely that everything will move to a cloud model all at once, so even
the simplest scenarios require some form of local / remote integration. It’s
also likely that we will have applications that never leave the building, due
to regulatory constraints like HIPPA, GLBA, and general security and NPPI
issues. All of this means integration must cross a firewall somwehere.
Cloud to Cloud also raises issues. Do we rely on the (competing) vendors to
provide integration with each other? Where is the integration hosted? Does
the integration live in the cloud as well? And if so, how does it connect to
those local applications?
Access may be limited
Access to cloud resources, whether SaaS, PaaS, or pure infrastructure, is more
limited than access to local applications.
With local applications, we usually had complete access to the application,
even when there were no good integration points in the original design. With
custom applications, adding integration hooks was possible. Even with
commercial applications, it was always possible to slip in database triggers to
raise events and provide hooks for integration access.
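For readers who haven’t seen the trigger trick, here is a minimal sketch of the
idea using SQLite from Python as a stand-in for whatever database the
commercial application runs on. The table names, trigger name, and event
columns are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);

    -- Side table that an integration job can poll; the application
    -- itself knows nothing about it.
    CREATE TABLE integration_events (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        table_name TEXT,
        row_id INTEGER,
        action TEXT
    );

    -- The "slipped in" hook: record an event on every update.
    CREATE TRIGGER customers_updated AFTER UPDATE ON customers
    BEGIN
        INSERT INTO integration_events (table_name, row_id, action)
        VALUES ('customers', NEW.id, 'update');
    END;
""")

conn.execute("INSERT INTO customers (id, email) VALUES (1, 'a@example.com')")
conn.execute("UPDATE customers SET email = 'b@example.com' WHERE id = 1")

events = conn.execute(
    "SELECT table_name, row_id, action FROM integration_events"
).fetchall()
print(events)
```

This only works because we own the database. That is exactly the level of
access that goes away once the application is hosted by someone else.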
Once applications move to the cloud, custom applications must be designed to
support integration because we no longer have that low level of access. For
SaaS, we are dependent on the vendor to provide the integration hooks and
APIs. For example, the Salesforce.com Web Services API doesn’t support
transactions against multiple records, which means integration code has to
handle that logic. For PaaS, the platform might support integration for
applications on the platform. Platform to Platform is still an open question.
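To make "integration code has to handle that logic" concrete, here is one
possible sketch: apply changes one record at a time and, if any call fails, put
the earlier records back. FakeCrm and its get/update methods are hypothetical
stand-ins, not the Salesforce.com API, and this is compensation rather than a
real transaction, so other clients can still see the intermediate state.

```python
class FakeCrm:
    """Hypothetical stand-in for a hosted API that updates one record per call."""

    def __init__(self, records):
        self.records = {key: dict(value) for key, value in records.items()}

    def get(self, record_id):
        return dict(self.records[record_id])

    def update(self, record_id, fields):
        if record_id not in self.records:
            raise KeyError(record_id)
        self.records[record_id].update(fields)


def update_all_or_nothing(client, changes):
    """Apply changes record by record; undo those already applied on failure."""
    applied = []  # (record_id, previous values), in the order they were written
    try:
        for record_id, fields in changes.items():
            before = client.get(record_id)
            client.update(record_id, fields)
            applied.append((record_id, {key: before.get(key) for key in fields}))
    except Exception:
        # Best-effort compensation, newest change first.
        for record_id, previous in reversed(applied):
            client.update(record_id, previous)
        return False
    return True


crm = FakeCrm({"a": {"owner": "x"}, "b": {"owner": "x"}})
failed = update_all_or_nothing(crm, {"a": {"owner": "y"}, "missing": {"owner": "y"}})
succeeded = update_all_or_nothing(crm, {"a": {"owner": "y"}, "b": {"owner": "y"}})
print(failed, succeeded, crm.get("a"), crm.get("b"))
```

Note that the undo step is itself a set of remote calls that can fail, which is
part of why this logic is a burden on the integration side.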
Some would argue that a limited set of APIs will improve the situation, since
backdoor access is what led integration into the ‘Duct Tape’ mess in the first
place. They have a valid point. But those APIs must be able to handle the
integration required, and the popularity of BeautifulSoup tells me screen
scraping as a workaround is alive and well.
Dynamic Resources
The true cloud model abstracts away most of our notions of physical hardware,
and everything becomes a service in some form. This is one of the benefits of
cloud, but it also means integration models change. In a world where
applications and infrastructure move and change dynamically, traditional
notions of tightly coupled integration are no longer valid. Add to this the
issues of application versioning (no longer under our control in a SaaS
environment), or PaaS platform changes (also no longer under our control) and
tight coupling becomes a dead end.
It’s clear from the SaaS vendors that the Web is the way to go when it comes to
client access. It seems that lower level interfaces will follow the
same REST route.
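One common way to loosen the coupling on the consuming side is to read only the
fields the integration actually needs and ignore the rest, so that a vendor
adding fields in a new version doesn’t break anything. The sketch below assumes
a JSON payload; the field names and the two sample versions are invented.

```python
import json

def read_lead(payload):
    """Pick out only the fields this integration needs; ignore everything else."""
    doc = json.loads(payload)
    return {
        "id": doc["id"],                          # required
        "email": doc.get("email"),                # optional, may be absent
        "owner": doc.get("owner", "unassigned"),  # optional, with a default
    }

# Two hypothetical versions of the same resource from the vendor.
v1 = '{"id": 7, "email": "a@example.com"}'
v2 = '{"id": 7, "email": "a@example.com", "owner": "pat", "score": 0.8}'

print(read_lead(v1))
print(read_lead(v2))
```

This doesn’t help when a vendor removes or renames a field, which is still
outside our control.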
Performance
Cloud or not, we still can’t get away from physical limitations. Although we
may see better application scalability and performance, the network distances
between elements in the cloud are no longer under our control. Bandwidth isn’t
the limiting factor in most integration scenarios; round trip latency is. On
the LAN, we can optimize. In the cloud, we lose that ability, and have to live
with longer latency in combination with SLAs from multiple vendors.
These performance constraints change the integration model. Integration can no
longer depend on high performance local access - anyone who does complex
analysis on, say, Salesforce leads, or anyone trying to pull SaaS data into
a warehouse will tell you performance is already an issue.
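A back-of-the-envelope model shows why latency rather than bandwidth tends to
dominate. It assumes one round trip per request plus the time the data spends
on the wire; the record count, record size, link speed, and round trip times
are illustrative guesses, not measurements.

```python
def transfer_seconds(records, bytes_per_record, round_trip_s, bandwidth_bps,
                     batch_size=1):
    """Estimate total time: one round trip per request plus time on the wire."""
    requests = -(-records // batch_size)  # ceiling division
    wire_time = records * bytes_per_record * 8 / bandwidth_bps
    return requests * round_trip_s + wire_time

# Assumed: 10,000 records of 1 KB each over a 10 Mbit/s link.
lan = transfer_seconds(10_000, 1_000, 0.001, 10_000_000)
wan = transfer_seconds(10_000, 1_000, 0.100, 10_000_000)
wan_batched = transfer_seconds(10_000, 1_000, 0.100, 10_000_000, batch_size=200)

print(f"LAN {lan:.0f}s, WAN {wan:.0f}s, WAN batched {wan_batched:.0f}s")
```

Under these assumptions the same job takes roughly 18 seconds on the LAN and
roughly 1,008 seconds record-by-record across the WAN, even though the bandwidth
is identical. Batching 200 records per request brings it back to about 13
seconds, but only if the vendor’s API allows batching.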
What this means for integration
There are a lot of implications as a result of the shift to a cloud model, and
I originally hoped we could get into the deeper issues during CloudCamp, but
time wasn’t on our side.
Cloud vendors understand that integration is an issue, particularly at the
application and platform level. We are increasingly seeing both
integration as a service, and integration as part of the platform.
The question is what form that integration capability will take. Will it be
open and standards based like OpenSocial? Or will it be vendor and platform specific?
There’s a lot at stake here,
because there’s a strong possibility that integration is becoming part of the
vendor lock-in strategy.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=202
What kind of service guarantees can I expect from PaaS?
Posted 14 June, 2008 by Chris in PaaS, web services
1 Comment
Peter Laird wrote up a great analysis on Terms of Service for SaaS providers and it got me wondering: What about the emerging Platform as a Service providers?
They’ve got an especially difficult challenge because they’re running other people’s code. How do you ensure availability when it’s not your code? Google’s App Engine conspicuously avoids the problem by not providing any SLA.
Generally speaking, there seem to be three categories of providers:
- Infrastructure, managed hosting and run time environments: OpSource, Etelos, Joyent, AWS, GAE, etc.
- Cloud IDEs: Bungee Labs, etc.
- App Builders: CogHead, LongJump, etc.
Obviously, the closer you get to bare metal, the harder it is for you to provide service guarantees.
I’ve already noted that AWS provides for 99.9% in their SLA, but that’s only for the service itself. Nowhere in that document does it say anything about your image’s uptime. It seems that you are entirely on your own.
This is a tricky problem even for the app builders. They all support some kind of scripting environment and you can only sandbox so much. Not to pick on CogHead, but I found this post on their forum. So, as you can see, these problems are going to show up everywhere.
I’d love to hear from anyone who has more details on SLAs from platform providers.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=201
High-fiving failure and a lesson on how not to treat your users…
Posted 10 June, 2008 by Chris in Cloud
2 Comments
I saw today via Techmeme that Twitter was excited and proud that they were able to achieve 97.3% uptime during the Apple WWDC keynote yesterday.
If I were them, I’d be a little more humble and a little more circumspect. Reading through the comments, you’d think they just landed a man on the moon.
First, let me say that under many circumstances achieving 97.3% availability is grounds for termination. Most Enterprise SLAs specify 99.9% or more with service credits applied for failure. Amazon’s SLA provides for 99.9% with 10% credited back if they fall below that and 25% credit below 99%.
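The credit tiers just described can be written down as a small function. This
is only a restatement of the numbers quoted above (10% below 99.9%, 25% below
99%), not a reading of the actual SLA text, and the function name is my own.

```python
def service_credit_percent(uptime_percent):
    """Credit tiers as quoted in this post: 10% below 99.9%, 25% below 99%."""
    if uptime_percent < 99.0:
        return 25
    if uptime_percent < 99.9:
        return 10
    return 0

for uptime in (99.95, 99.5, 98.0):
    print(uptime, service_credit_percent(uptime))
```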
Salesforce.com had some serious trouble with availability a while back and people were legitimately wondering if they would survive the crisis. Today, they make a provision for these SLA failure expenses, and so far, have been lucky enough (i.e. smart enough) not to have to make any payments.
Just to put that in perspective, 99.9% uptime translates to about 44 minutes of downtime per month (99%, about 440). So, at 97.3% for the (roughly) four hours of peak-time usage during the keynote, they were unavailable for about 6.5 minutes, or nearly 15% of the downtime budget for the month.
This isn’t something I would be proud of.
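For anyone who wants to check the arithmetic, here it is as a short script. The
only assumptions are an average-length month (365 / 12 days) and the
four-hour keynote window mentioned above.

```python
def downtime_minutes(uptime_percent, period_minutes):
    """Minutes of downtime implied by an uptime percentage over a period."""
    return (1 - uptime_percent / 100) * period_minutes

MONTH_MINUTES = 365 / 12 * 24 * 60  # average month, about 43,800 minutes

budget_999 = downtime_minutes(99.9, MONTH_MINUTES)  # monthly budget at 99.9%
budget_99 = downtime_minutes(99.0, MONTH_MINUTES)   # monthly budget at 99%
keynote = downtime_minutes(97.3, 4 * 60)            # four-hour keynote window
share = keynote / budget_999                        # fraction of budget used

print(f"{budget_999:.1f} {budget_99:.0f} {keynote:.2f} {share:.1%}")
```

That works out to about 43.8 and 438 minutes for the two monthly budgets, about
6.5 minutes of downtime during the keynote, and a little under 15% of a 99.9%
monthly budget spent in four hours.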
Second, your users are not your QA or test engineering department. They claim:
…we learned a lot during this stress test and that will translate to better performance down the line.
Finally, turning off features to support peak loads is treating the symptoms, not the underlying problem.
Is it any wonder their site is as unreliable as it is? With this kind of attitude, I don’t think things are going to materially change anytime soon.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=200
Integration remains the number one inhibitor to adopting SaaS…
Posted 9 June, 2008 by Chris in SnapLogic, SOA, Open Source
No Comments
CIO magazine reports on a recent survey by Forrester indicating that integration remains the number one inhibitor for enterprises adopting SaaS.
This comes as no surprise to me.
We’ve been talking to all the large hosted application providers and they all say the same thing: My subscribers have data inside their organization and they need to get it out. The larger the company, the more likely they need to integrate their SaaS application with behind-the-firewall data and other applications. When integrating that data becomes more difficult (or less secure, or less available), the extra integration complexity is traded off against the simplicity of a SaaS alternative. Once the complexity gets too large, it can quickly overcome the benefits of SaaS.
Integrating SaaS with internal apps remains problematic since access generally requires software that runs inside the organization to manipulate and transform the data and to initiate integration with the SaaS apps for higher security.
There are a number of hosted integration alternatives emerging, but few adequately address the behind-the-firewall issue. The ones that do suffer from being either too complex for the SaaS marketplace or too limited in function to handle the task.
We here at SnapLogic have been working on the problem for a while now. We think we’ve got the right approach to this problem. We’ve got open source software that you can download and run wherever the problem lives. Our distributed approach allows you to apply the necessary access and/or transformation functions wherever they’re needed, and deploy it on top of your existing infrastructure, whether that’s in your data center, in the DMZ, or in a co-location facility. And since it works just like the web, you already know how to keep it running fast and secure.
Of all the possible deployment scenarios for SnapLogic, integrating SaaS applications with enterprise apps is by far the most common.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=199
Posted 8 July, 2008 by mikeyp in conference, Python, community
No Comments
I’m in Vilnius at EuroPython 2008 this week. There are about 250 attendees, which is somewhat bigger than last year.
This is already shaping up to be a good conference. The overall energy level here is high. Everyone is enthusiastic about Python and the projects they work on.
This is definitely a Python community event, and it has the comfortable feeling of an unconference, while still being well organized. It’s a good balance.
The overall feel is similar to PyCon, yet at the same time it’s subtly different in a way that I can’t fully explain yet. It’s not just a culture or USA / Europe thing - there’s something more to the difference. So far, I have noticed more of a tendency to continually improve and challenge the status quo. There is a strong spirit of innovation and there have been many presentations about ideas and work in progress.
I fought off jet lag, and caught a full day of sessions on Monday.
So far, I attended:
- Marc-Andre Lemburg’s session on the Python database API
- Dinu Gherman’s talk on his work with Paragraphs in ReportLab
- Tommi Virtanen’s talk on filesystem-like APIs
- Ignas Mikalajunas’s talk on the benefits of eggs
- Christian Scholz’s talk on data portability
I had a 4pm talk on using SnapLogic to analyze Trac tickets, which was well attended. There’s a lot of interest in what we’re doing, and plenty of questions about what the practical applications are. The Trac integration project I’m working on should help in that area.
Slides from my talk (and others) are on the EuroPython wiki talks page, and also on the SnapLogic wiki.
The timezones eventually caught up with me, and I missed Guido van Rossum’s keynote at the end of the day.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=203
« Previous
Posted 27 June, 2008 by mikeyp in Cloud, SOA, ESB, EAI, ETL
1 Comment
I went to the excellent CloudCamp gathering in San Francisco Tuesday night. CloudCamp was organized by Reuven Cohen, Jesse Silver, Dave Nielsen, and others, and there were about 350 people there (SnapLogic also sponsored). The timing was good, since it overlapped with a number of other conferences, which attracted a lot of folks from out of town. Overall, I declare this a success, and I’m looking forward to future CloudCamp events.
Cloud Computing is a broad concept, and that was reinforced by the nunber of talks and discussions that seemed to always converge on the topic of ‘What is cloud computing ?’
With an evolving concept like cloud computing, it’s hard to really nail down a definition of what it is, since there really isn’t an it yet. (It reminds me of the early days of the Internet, and the early days of the World Wide Web…how could anyone put a definition on either of those in the early days?)
I’m not going to even attempt to define cloud computing. I’m simply treating it as the general trend toward virtualization of computing resources. (Others have done a good job of clarifying the space. See Peter Laird’s post on the cloud market, or Reuven Cohen’s description.)
As of now, the cloud space can be divided into several broad market areas, primarily:
- Software as a Service (Saas)
- Platform as a Services (PaaS)
- Infrastructure as a Service (IaaS)
There is a lot of current activity in all of these areas among vendors and adopters. However, cloud computing will not happen overnight - it will be a gradual transition and, current hype aside, that transition is ongoing in all of these areas.
In the meantime, there are still a lot of existing software applications that will not move to the ‘cloud’ any time soon. Looking at the trends, this raises a lot of issues about how we are going to integrate all of these into a cohesive functional unit. As a result, I chaired a session at CloudCamp to promote some discussion on this topic, and we had 25 or 30 people in the room.
The session didn’t go as smoothly as I hoped, since there was one dissenting voice, who insisted the solution to all these problems was local storage on the client side. That aside, we did cover some good topics I’m summarizing here, and adding some of my other thoughts.
The integration problem
Integration is a very broad term, that encompasses multiple levels, from low level data interfaces to workflow and higher level business process automation. Despite the various levels and interests in the room, there was general consensus that integration is a real issue with cloud computing.
Some participants pointed that we haven’t fully solved the integration problem locally. Despite the vision of SOA, and the existence of messaging, there’s actually a lot of integration done today with Duct Tape and Paper Clips (or, in hard currency, Perl and PL/SQL.) The larger enterprises have the resources and skills to build their own integration capability, but smaller business (the earliest adopters of SaaS), simply don’t have the infrastructure in place. 65% of the enterprises surveyed by Forrested in late 2007 indicated integration issues as their reason for not considering SaaS.
The reality is that integration matrix is still too complex, with it’s permutation of protocols, access methods, and data formats.
Ray Wang of Forrester also notes this integration barrier is ‘more fallacy than fact’, although I don’t agree with that latter statement - cloud raises some significant new integration challenges, and they shouldn’t be underestimated.
Why the cloud makes integration harder
If we have difficulty with local integration, the cloud wave only makes integration more difficult. There are a number of reasons why:
- New integration scenarios
- Access to the cloud may be limited
- Dynamic resources
- Performance
New integration scenarios
Before the cloud model, we had to tie local systems together. With the shift to a cloud model, we now have to connect local applications to the cloud, and we also have to connect cloud applications to each other, which adds new permutations to the matrix.
Its unlikely that everything will move to a cloud model all at once, so even the simplest scenarios require some form of local / remote integration. It’s also likely that we will have applications that never leave the building, due to regulatory constraints like HIPPA, GLBA, and general security and NPPI issues. All of this means integration must cross a firewall somwehere.
Cloud to Cloud also raises issues. Do we rely on the (competing) vendors to provide integration with each other ? Where is the integration hosted ? Does the integration live in the cloud as well? And if so, how does it connect to those local applications ?
Access may be limited
Access to cloud resources, either SaaS, PaaS, or pure infrastructure, is more limited than local applications.
With local applications, we usually had complete access to the application, even when there were no good integration points in the original design. With custom applications, adding integration hooks was possible. Even with commercial applications, it was always possible to slip in database triggers to raise events and provide hooks for integration access.
Once applications move to the cloud, custom applications must be designed to support integration because we no longer have that low level of access. For SaaS, we are dependent on the vendor to provide the integration hooks and API’s. For example, the SalesForce.com Web Services API doesn’t support transactions against multiple records, which means integration code has to handle that logic. For PaaS, the platform might support integration for aplications on the platform. Platform to Platform is still an open question.
Some would argue that a limited set of APIs will improve the situation, since backdoor access is what led integration into the ‘Duct Tape’ mess in the first place. They have a valid point. But those API’s must be able to handle the integration required, and the popularity of BeautifulSoup tells me screen scraping as a workaround is alive and well.
Dynamic Resources
The true cloud model abstracts away most of our notions of physical hardware, and everything becomes a service in some form. This is one of the benefits of cloud, but it also means integration models change. In a world where applications and infrastructure move and change dynamically, traditional notions of tightly coupled integration are no longer valid. Add to this the issues of application versioning (no longer under our control in a SaaS environment), or PaaS platform changes (also no longer under our control) and tight coupling becomes a dead end.
It clear from the SaaS vendors that the Web is the way to go when it comes to client access. It seems that lower level interfaces will follow the same REST route.
Performance
Cloud or not, we still can’t get away from physical limitations. Although we may see better application scalability and performance, the network distances between elements in the cloud are no longer under our control. Bandwidth isn’t the limiting factor in most integration scenarios; round trip latency is. On the LAN, we can optimize. In the cloud, we lose that ability, and have to live with longer latency in combination with SLAs from multiple vendors.
These performance constraints change the integration model. Integration can no longer depend on high performance local access. Anyone who does complex analysis on, say, Salesforce leads, or who tries to pull SaaS data into a warehouse, will tell you performance is already an issue.
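A rough back-of-the-envelope model shows why latency dominates. The sketch below is only an illustration: the function name `transfer_seconds` and all of the numbers (100,000 records of 1 KB each, a 100 ms round trip, 10 MB/s of bandwidth) are assumptions I picked, not measurements from any vendor.

```python
def transfer_seconds(records, bytes_per_record, round_trip_s,
                     bandwidth_bytes_per_s, batch_size=1):
    """Estimate the time to move records over a link, one request per batch."""
    round_trips = -(-records // batch_size)  # ceiling division
    latency_cost = round_trips * round_trip_s
    payload_cost = records * bytes_per_record / bandwidth_bytes_per_s
    return latency_cost + payload_cost


if __name__ == "__main__":
    # Assumed workload: 100,000 records of 1 KB, 100 ms round trip, 10 MB/s link.
    one_at_a_time = transfer_seconds(100_000, 1_000, 0.1, 10_000_000)
    batched = transfer_seconds(100_000, 1_000, 0.1, 10_000_000, batch_size=200)
    print(f"one record per request: {one_at_a_time:,.0f} s")
    print(f"200 records per request: {batched:,.0f} s")
```

With these assumed numbers the payload itself takes about 10 seconds either way. The record-at-a-time version spends roughly 10,000 more seconds waiting on round trips, while batching 200 records per request cuts that wait to about 50 seconds. Extra bandwidth would barely move either figure.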
What this means for integration
There are a lot of implications as a result of the shift to a cloud model, and I originally hoped we could get into the deeper issues during CloudCamp, but time wasn’t on our side.
Cloud vendors understand that integration is an issue, particularly at the application and platform level. We are increasingly seeing both integration as a service, and integration as part of the platform.
The question is what form that integration capability will take. Will it be open and standards based like OpenSocial? Or will it be vendor and platform specific?
There’s a lot at stake here, because there is a strong possibility that integration is becoming part of the vendor lock-in strategy.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=202
What kind of service guarantees can I expect from PaaS?
Posted 14 June, 2008 by
Chris in
PaaS, web services
1 Comment
Peter Laird wrote up a great analysis on Terms of Service for SaaS providers and it got me wondering: What about the emerging Platform as a Service providers?
They’ve got an especially difficult challenge because they’re running other people’s code. How do you ensure availability when it’s not your code? Google’s App Engine conspicuously avoids the problem by not providing any SLA.
Generally speaking, there seem to be three categories of providers:
Infrastructure, managed hosting and run time environments: OpSource, Etelos, Joyent, AWS, GAE, etc.
Cloud IDEs: Bungee Labs, etc.
App Builders: CogHead, LongJump, etc.
Obviously, the closer you get to bare metal, the harder it is for you to provide service guarantees.
I’ve already noted that AWS provides for 99.9% in their SLA, but that’s only for the service itself. Nowhere in that document does it say anything about your image’s uptime. Seems that you are on your own entirely.
This is a tricky problem even for the app builders. They all support some kind of scripting environment and you can only sandbox so much. Not to pick on CogHead, but I found this post on their forum. So, as you can see, these problems are going to show up everywhere.
I’d love to hear from anyone who has more details on SLAs from platform providers.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=201
High-fiving failure and a lesson on how not to treat your users…
Posted 10 June, 2008 by
Chris in
Cloud
2 Comments
I saw today via Techmeme that Twitter was excited and proud that they were able to achieve 97.3% uptime during the Apple WWDC keynote yesterday.
If I were them, I’d be more humble and a little more circumspect. Reading through the comments, you’d think they just landed a man on the moon.
First, let me say that under many circumstances achieving 97.3% availability is grounds for termination. Most Enterprise SLAs specify 99.9% or more with service credits applied for failure. Amazon’s SLA provides for 99.9% with 10% credited back if they fall below that and 25% credit below 99%.
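As a small illustration, the credit tiers just quoted can be written as a lookup. This sketch uses only the figures above (10% below 99.9%, 25% below 99%); the function name `service_credit_pct` is hypothetical, and the actual SLA text may define the thresholds and measurement period differently.

```python
def service_credit_pct(monthly_uptime_pct):
    """Return the service credit (as a % of the bill) for a month's uptime."""
    if monthly_uptime_pct < 99.0:
        return 25
    if monthly_uptime_pct < 99.9:
        return 10
    return 0  # SLA met, no credit owed


if __name__ == "__main__":
    for uptime in (99.95, 99.5, 97.3):
        print(f"{uptime}% uptime: {service_credit_pct(uptime)}% credit")
```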
Salesforce.com had some serious trouble with availability a while back and people were legitimately wondering if they would survive the crisis. Today, they make a provision for these SLA failure expenses, and so far, have been lucky enough (i.e. smart enough) not to have to make any payments.
Just to put that in perspective, 99.9% uptime translates to about 44 minutes of downtime per month (99%, about 440). So, at 97.3% for the (roughly) four hours of peak usage during the keynote, they were unavailable for about 6.5 minutes, or nearly 15% of the downtime budget for the month.
This isn’t something I would be proud of.
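For anyone who wants to check the arithmetic, here is a quick sketch. It assumes a 30-day month (43,200 minutes), which puts the 99.9% budget at roughly 43 minutes; a 31-day month gives a little under 45. The function name `downtime_minutes` is mine, purely for illustration.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200, assuming a 30-day month


def downtime_minutes(availability_pct, window_minutes):
    """Minutes of downtime implied by an availability figure over a time window."""
    return window_minutes * (100.0 - availability_pct) / 100.0


if __name__ == "__main__":
    budget = downtime_minutes(99.9, MINUTES_PER_MONTH)  # monthly budget at 99.9%
    keynote = downtime_minutes(97.3, 4 * 60)            # four-hour keynote window
    print(f"monthly budget at 99.9%: {budget:.1f} min")
    print(f"keynote downtime at 97.3%: {keynote:.2f} min")
    print(f"share of monthly budget: {keynote / budget:.0%}")
```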
Second, your users are not your QA or test engineering department. They claim:
…we learned a lot during this stress test and that will translate to better performance down the line.
Finally, turning off features to support peak loads is treating the symptoms, not the underlying problem.
Is it any wonder their site is as unreliable as it is? With this kind of attitude, I don’t think things are going to materially change anytime soon.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=200
Integration remains the number one inhibitor to adopting SaaS…
Posted 9 June, 2008 by
Chris in
SnapLogic, SOA, Open Source
No Comments
CIO magazine reports that a recent survey by Forrester indicates that integration remains the number one inhibitor for enterprises adopting SaaS.
This comes as no surprise to me.
We’ve been talking to all the large hosted application providers and they all say the same thing: My subscribers have data inside their organization and they need to get it out. The larger the company, the more likely they need to integrate their SaaS application with behind-the-firewall data and other applications. When integrating that data becomes more difficult (or less secure, or less available), the extra integration complexity is traded off against the simplicity of a SaaS alternative. Once the complexity gets too large, it can quickly outweigh the benefits of SaaS.
Integrating SaaS with internal apps remains problematic since access generally requires software that runs inside the organization to manipulate and transform the data and to initiate integration with the SaaS apps for higher security.
There are a number of hosted integration alternatives emerging, but few adequately address the behind-the-firewall issue. The ones that do are either too complex for the SaaS marketplace or too limited in function to handle the task.
We here at SnapLogic have been working on the problem for a while now. We think we’ve got the right approach to this problem. We’ve got open source software that you can download and run wherever the problem lives. Our distributed approach allows you to apply the necessary access and/or transformation functions wherever they’re needed, and deploy it on top of your existing infrastructure, whether that is in your data center, in the DMZ, or in a co-location facility. And since it works just like the web, you already know how to keep it running fast and secure.
Of all the possible deployment scenarios for SnapLogic, integrating SaaS applications with enterprise apps is by far the most common.
Trackback URL : http://blog.snaplogic.org/wp-trackback.php?p=199


