Living in the Compute Cloud – Web 2.0 Expo Berlin

Your site can attract a lot of traffic for many different reasons, and on top of its regular load it can experience sudden traffic peaks.

To deal with this you can build your own infrastructure, but today there are other solutions available, such as the ones provided by Amazon and Google.

Amazon Web Services

There are several platforms:

  • S3 is used for storage
  • EC2 provides on-demand virtual servers controlled through a web service API (you can use your favourite Linux distribution). It offers ACLs for port control, lets you choose the datacenter (currently only in the US), and supports snapshot backups to S3
  • SimpleDB is a hash-like database that stores items as attribute/value pairs. It is meant for small items organized into domains; it is redundant and distributed, has no schema, stores everything as a string, supports list values, and is queried with SQL-like statements (see the sketch after this list)
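
To make the SimpleDB model more concrete, here is a minimal sketch in Python using the classic boto library; the domain name, item and attributes are invented for the illustration, and AWS credentials are assumed to be configured in the environment.

    import boto

    # Connect to SimpleDB (assumes AWS credentials are available in the environment).
    sdb = boto.connect_sdb()

    # Items live in "domains"; there is no schema to declare up front.
    domain = sdb.create_domain('products')  # hypothetical domain name

    # Every value is stored as a string; an attribute can hold a list of values.
    domain.put_attributes('item-001', {
        'name': 'T-shirt',
        'price': '019.90',          # zero-padded so string comparison behaves numerically
        'tags': ['cotton', 'red'],  # list values are supported natively
    })

    # Retrieval uses SQL-like SELECT queries.
    for item in domain.select("select * from products where price > '010.00'"):
        print('%s -> %s' % (item.name, dict(item)))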

Google App Engine

With this solution you run your application directly on the Google infrastructure. There is no concept of hardware: you just deploy an application. For the moment it is limited to Python, and it certainly does not have the same flexibility as the Amazon solutions. To compensate for the lack of access to low-level sockets you get built-in APIs: memcache, image manipulation, email, URL fetch, Google authentication and users. The platform is limiting, but it takes care of scaling problems.
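
As a rough illustration of those built-in services, here is a minimal sketch of a request handler for the original Python runtime, using the users and memcache APIs; the handler, URL mapping and cache key are invented for the example.

    from google.appengine.api import memcache, users
    from google.appengine.ext import webapp
    from google.appengine.ext.webapp.util import run_wsgi_app

    class MainPage(webapp.RequestHandler):
        def get(self):
            # Google Accounts integration replaces rolling your own authentication.
            user = users.get_current_user()
            if not user:
                self.redirect(users.create_login_url(self.request.uri))
                return

            # memcache is the sanctioned substitute for low-level caching tricks.
            greeting = memcache.get('greeting:%s' % user.user_id())
            if greeting is None:
                greeting = 'Hello, %s' % user.nickname()
                memcache.set('greeting:%s' % user.user_id(), greeting, time=300)

            self.response.out.write(greeting)

    application = webapp.WSGIApplication([('/', MainPage)])

    if __name__ == '__main__':
        run_wsgi_app(application)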

Bigtable is Google's database solution. It is in some ways very similar to SimpleDB (no schema, list values) but also very different (data type support, references and multiple tables, blob fields up to 1 MB). What is very limiting is that queries can only run for a couple of seconds; after that they are killed by the system. On the other hand it is very easy to use. In short, you have to accept the limitations.
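
To show what that typed but schema-flexible model looks like in practice, here is a minimal sketch using the Python datastore API that App Engine exposes on top of Bigtable; the model classes and the query are illustrative, not taken from the talk.

    from google.appengine.ext import db

    class Author(db.Model):
        name = db.StringProperty(required=True)

    class Post(db.Model):
        title = db.StringProperty()                         # typed properties, unlike SimpleDB
        published = db.DateTimeProperty(auto_now_add=True)
        tags = db.StringListProperty()                      # list values, as in SimpleDB
        author = db.ReferenceProperty(Author)               # references between kinds ("tables")
        thumbnail = db.BlobProperty()                       # blobs are capped at 1 MB

    # Queries have to finish quickly; long-running requests are killed by the runtime.
    recent = Post.gql("WHERE tags = :1 ORDER BY published DESC LIMIT 10", 'cloud')
    for post in recent:
        print('%s by %s' % (post.title, post.author.name))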

With Google App Engine you have no background jobs and no way to back up or snapshot your data; emails can only be sent from Google accounts, and you are restricted to pure-Python libraries and the provided APIs.

Considerations and usage suggestions

The impression from this session is that a lot of tricks are needed to use these tools proficiently, even Amazon's. The speaker illustrated some cases, such as uploading user data with authentication.

If your application needs extra capacity for a certain period of time, with Amazon EC2 it is quite easy to start additional instances: it is just a matter of using a time-based system such as cron.
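
A rough sketch of what such a cron-driven script might look like, using the boto EC2 API; the AMI ID, key pair, instance type and schedule are placeholders, not values given in the talk.

    # start_peak_capacity.py - run from cron, e.g. "0 8 * * 1-5", with a matching
    # script scheduled in the evening to terminate the extra instances again.
    import boto

    conn = boto.connect_ec2()  # assumes AWS credentials in the environment

    reservation = conn.run_instances(
        'ami-00000000',            # placeholder for your own prepared image
        min_count=2,
        max_count=2,
        key_name='my-keypair',
        instance_type='m1.small',
    )

    for instance in reservation.instances:
        print('started %s' % instance.id)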

If instead the need is for load-based scaling, a possible solution is to integrate EC2 usage with a monitoring tool such as Monit. With these tools you can detect when the load is too high and, if necessary, add new instances. Monitoring is the hard part here, because there is no ready-made solution for it.
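
Because there is no ready-made solution, the glue code tends to be custom. A minimal sketch of the idea in Python (check the load average and launch an extra instance when it is too high) might look like the following; the threshold and AMI ID are invented, and in practice a tool like Monit would be configured to invoke such a script.

    import os
    import boto

    LOAD_THRESHOLD = 4.0     # illustrative value
    AMI_ID = 'ami-00000000'  # placeholder for a prepared image

    def maybe_scale_up():
        # 5-minute load average of the machine running the check (Unix only).
        _, load_5min, _ = os.getloadavg()
        if load_5min > LOAD_THRESHOLD:
            conn = boto.connect_ec2()
            reservation = conn.run_instances(AMI_ID, instance_type='m1.small')
            return reservation.instances[0].id
        return None

    if __name__ == '__main__':
        started = maybe_scale_up()
        if started:
            print('started %s' % started)
        else:
            print('load ok, nothing to do')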

Even if the site already has its own working infrastructure, it is possible, if needed, to add extra capacity by connecting to EC2, combining the best of both worlds. However, EC2 is not available in Europe at the moment, so there could be latency problems.

Real life use cases of these platforms:

  • googbad.me
  • dawanda.com
  • g.ho.st

Final thoughts

  • get accustomed to eventual consistency (data written a few milliseconds ago may not yet be reflected in all instances)
  • be prepared to leave relational databases behind
  • many miss strong SLAs, but most of the time you can live fine without them
  • hardware is a commodity; only specialize in it if it is really necessary
Jonathan Weiss
A Ruby consultant and partner at Peritor Wissensmanagement GmbH in Berlin, Germany. For the last few years he has been developing and consulting on large Ruby on Rails projects, focusing on scalability and security. He is an active member of the Ruby and Rails community and the developer of the open-source deployment tool Webistrano. In his spare time he maintains Rubygems and Rails in the FreeBSD Ports system.

RIA and Ajax Security Workshop – Web 2.0 Expo Berlin

A very interesting and informative talk dealing with the new types of attacks that affect web 2.0 applications and RIA in particular.

The session was divided into two parts, the first about AJAX and the second about Rich Internet Applications.

The slides of this talk are available on SlideShare and are impressive for their completeness. Not only do they provide detailed examples for every case illustrated, but they also link to a series of articles and web resources.

The main problem with this talk is that it is nearly impossible to be specific enough without, at the same time, getting too deep into the details. This resulted in some hard-to-understand parts.

AJAX

In general, attacking an AJAX application is more difficult than attacking a web 1.0 site. On the other hand, it is also more difficult to protect an AJAX application, because there are more ways to exploit it and new ones are discovered every day.

  • Not all “web 2.0” sites use new technologies (YouTube and MySpace, for example)
  • A single page on MySpace has a lot of includes
  • Google Maps also has a lot of includes, but of JavaScript code. Google code can potentially be insecure

Why care about web 2.0 security

  • People have changed how they interact with web sites (they erase privacy barriers and don't feel the distance; they are the new generations)
  • Technologies spread from innovators to traditionalists (today AJAX is used in financial institutions, health care, government); it has gone mainstream
  • Bugs are affecting people now

Discovery and method manipulation

  • Playing with parameters is still an excellent web attack (asking the application to do the work for you). As business logic gets more complex, so do parameter vulnerabilities
  • Figuring out web apps is a tough part of a pen-test

Two types of Ajax apps

  • client-server proxy (equivalent to SOAP, client hides javascript)
  • client-side rendering (we can see the javascript and know what it does)

Cross Site Scripting

  • Downstream communication methods are much more complicated
  • User-controlled data might end up in arguments of dynamically created JavaScript, inside JavaScript arrays, and so on. As a result, both attack and defence are more difficult (see the sketch after this list)
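
As a hedged illustration of the defensive side, here is a small Python sketch of serializing user-controlled values into a JavaScript array literal; the helper name is invented. json.dumps neutralizes quotes and backslashes, and the extra escaping of '</' prevents a value from closing the surrounding script tag.

    import json

    def js_array_literal(values):
        # Serialize user-controlled strings into a JavaScript array literal.
        # json.dumps escapes quotes and backslashes; '</' is escaped as well so a
        # value such as '</script><script>...' cannot break out of the script block.
        return json.dumps(values).replace('</', '<\\/')

    # Plain HTML-entity escaping is not enough here: the data ends up inside
    # dynamically created JavaScript, not inside HTML text.
    user_tags = ['red', '"];alert(document.cookie);//', '</script><script>evil()']
    print('var tags = %s;' % js_array_literal(user_tags))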

Four bugs

  • downstream JS arrays can carry dangerous characters
  • An XSS payload can be tucked into many places
  • XSS might already be in the DOM (document.URL, document.location, document.referrer)
  • AJAX uses “backend” requests that were never expected to be seen directly in the browser

RIA

RIA is an ill-defined term: it covers many things, such as AJAX, Flash, offline mode, and decoupling from the browser. There is a huge disparity in features and security design.

Why use RIA

  • to increase responsiveness
  • desktop integration
  • to write full desktop apps

RIA Frameworks

No framework is without limits and security problems. The worst seems to be Adobe AIR, because it shows all the limits of the very old ActiveX model.

The frameworks:

  • Adobe AIR
  • Microsoft Silverlight
  • Google Gears
  • Mozilla Prism

Adobe AIR

  • Full-featured
  • Cross-browser, cross-platform
  • Created with Flex, Flash
  • Can be invoked by browser with arguments, like ActiveX or Flash
  • AIR is best thought of as ActiveX rather than as Flash++ (code runs with full privileges and can install malware)
  • SWF files can import functionality that allows them to interact with AIR applications
  • SWF files can check install status and version
  • By default, code included in AIR application has full rights
  • There is no “code access security” model as in Java or .NET
  • AIR has many ways of loading executable content to run, such as HTML/JS and SWF
  • AIR applications can be bundled as binaries
  • Problems: allowing users to install signed applets is dangerous. Allowing self-signed is terrifying
  • Some suggestions for Adobe: change the default action, disable unsigned install prompts

Silverlight

There is a lot of attention to security here.

  • It is the Microsoft equivalent of Flash
  • Cross browser and cross platform
  • A subset of the .NET framework
  • The security model is based on .NET
  • Calls to system primitives will fail; you need to isolate your code
  • What could go wrong: threading, DoS attacks against the local system

Google Gears

  • Has SQLite embedded
  • Uses a homegrown API for synchronizing data
  • Has a LocalServer
  • Works offline via SQL database, local assets and a local app server
  • Uses same-origin restrictions to limit access to site databases and LocalServer resource capture
  • Provides for parametrized SQL
  • Unfortunately it allows personalization of the opt-in screen

Yahoo! BrowserPlus

  • A very bad idea
  • Runs as a browser plugin, with a separate helper process
  • It’s very similar to ActiveX concepts
  • Uses an old version of Ruby (perfectly safe, as long as you don’t use strings and arrays)

Mozilla Prism

  • Wraps web apps so they appear as desktop apps
  • Standalone browser instance
  • Problem: the Javascript included with webapps has full XPCOM privileges (but no content scripting privileges)
  • Problem: the sandbox isn’t real

HTML 5

HTML 5 introduces some new concepts related to the storage of information.

  • Introduces DOM storage (sessionStorage, localStorage, database storage)
  • The major goals are more storage space and real persistence, because cookies are considered too small and users delete cookies or won’t accept them
  • These mechanisms bypass pesky users who delete cookies, although users can still disable them through a specific about:config directive

Browser based SQL Databases

  • Injection becomes far more damaging (because the local database runs with a lot of privileges)

Checklist

  • avoid predictably named data stores
  • parameterize SQL statements (see the sketch after this list)
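
As an illustration of the second point in Python (the standard sqlite3 module stands in for the SQLite engine embedded in Gears and in the HTML 5 proposals), with an invented table and query:

    import sqlite3

    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE notes (id INTEGER PRIMARY KEY, owner TEXT, body TEXT)')
    conn.execute("INSERT INTO notes (owner, body) VALUES ('alice', 'secret')")

    user_input = "alice' OR '1'='1"  # classic injection attempt

    # Unsafe: string concatenation lets the input rewrite the query.
    # conn.execute("SELECT body FROM notes WHERE owner = '" + user_input + "'")

    # Safe: the parameter is bound as data and never parsed as SQL.
    rows = conn.execute('SELECT body FROM notes WHERE owner = ?', (user_input,))
    print(rows.fetchall())  # [] -- the injection attempt matches nothing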

Summary

  • RIA frameworks widely vary in their security models
  • It is highly likely that web developers will introduce interesting flaws into their desktop applications
Alex Stamos is a Founding Partner of iSEC Partners, Inc, a strategic digital security organization. Alex is an experienced security engineer and consultant specializing in application security and securing large infrastructures, and has taught multiple classes in network and application security. He is a leading researcher in the field of web application and web services security and has been a featured speaker at top industry conferences such as Black Hat, CanSecWest, DefCon, SyScan, Microsoft BlueHat and OWASP App Sec. He is a contributing author of “Hacking Exposed: Web 2.0” and holds a BSEE from the University of California, Berkeley.
RIA And AJAX Security Workshop, Part 1 (slides embedded from SlideShare)

Lazy sites are the fastest

Over the years I have set aside quite a lot of material and documentation about web site performance and optimization, both on the so-called server side and on what is called the front end.

The time will come (soon, I hope) to compile an annotated list of all these resources (you can get an idea by visiting the optimization section of my delicious), but for now I will just mention an article that lays out very clearly one of the fundamental issues to tackle: Lazy web sites run faster, written by Gojko Adzic.

To increase a site's performance you can invest in hardware: more processors and faster systems, better connectivity, a modern infrastructure. You will notice, however, that even then the web server struggles, by its very architecture, to serve a site whose code is not optimized.

You can then devote yourself to rewriting (or refactoring) the code to make it faster. Here too, however, you will soon hit a limit.

The secret, according to Gojko, lies instead in designing a system that takes care to

  • delegate the most complex operations to processes running in the background (see the sketch after this list);
  • avoid communicating with external systems synchronously, no matter how fast;
  • be lazy: it is better to leave for later everything that does not need to be executed right away.
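
A minimal sketch of the first point, using nothing beyond the Python standard library: the user-facing code only enqueues the work, and a background worker thread performs the slow part (here a stand-in for sending an email) later. The function names are invented for the example.

    import queue
    import threading
    import time

    jobs = queue.Queue()

    def worker():
        # Runs in the background, outside the request/response cycle.
        while True:
            recipient, subject = jobs.get()
            time.sleep(2)  # stand-in for a slow SMTP conversation
            print('sent "%s" to %s' % (subject, recipient))
            jobs.task_done()

    threading.Thread(target=worker, daemon=True).start()

    def handle_signup(email):
        # The user-facing path returns immediately; the email goes out later.
        jobs.put((email, 'Welcome!'))
        return 'signup accepted'

    print(handle_signup('user@example.com'))
    jobs.join()  # only needed here so the demo waits for the worker before exiting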

I would also add: eliminate useless processing, such as running expensive queries (for example against the database) on every request for content that hardly ever changes. In this case it could be interesting to experiment with some caching mechanism.
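
A tiny sketch of that kind of caching, with an expiry time so that stale content is eventually refreshed; the query function and the ten-minute TTL are invented for the example.

    import time

    _cache = {}  # key -> (expiry timestamp, value)

    def cached(key, ttl, compute):
        # Return a cached value, recomputing it only when the TTL has expired.
        now = time.time()
        entry = _cache.get(key)
        if entry and entry[0] > now:
            return entry[1]
        value = compute()
        _cache[key] = (now + ttl, value)
        return value

    def load_homepage_boxes():
        # Stand-in for an expensive database query whose result rarely changes.
        time.sleep(1)
        return ['news', 'promotions', 'top articles']

    # The first call pays the full cost; later calls within ten minutes are instant.
    boxes = cached('homepage-boxes', ttl=600, compute=load_homepage_boxes)
    print(boxes)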

Thinking back to the bottlenecks of the projects I have seen up close, most of them could have been avoided by postponing operations that were not immediately essential, such as:

  • sending a confirmation email;
  • transforming files (especially XML files);
  • communicating with management systems;
  • computing statistics.

You can also find thought-provoking examples online. Every time I use Feedburner's "all time" feature to analyze the overall traffic of my feeds, I end up waiting at least ten seconds or so. The system is probably computing the summary in real time, when it could have prepared it in advance. There is nothing wrong with waiting, even though at times, due to server load, a timeout is returned instead. Perhaps not quite the ideal way to handle this feature, even if it is used by a minority of users.