Mammoth
  • |
  • Contact
  • |
CMD | Command Prompt, Inc. - PostgreSQL Solutions, Support & Hosting
  • |
  • |
  • |
  • |
  • |
FOSSExperts, day 2
Posted Wednesday Jul 28th, 2010 09:35am
by Joshua Drake
| Permalink

I expected feedback from the community on FOSSExperts. I did not expect feedback with such immediacy. All the feedback I have received so far is positive. Which is a great feeling. Here are the key points that are coming back.

How do deal with disagreement about the deliverable:

This is an interesting one. I wanted to keep FOSSExperts simple. That is why the deliverable on the ALTER TABLE project is simple, committed to PostgreSQL Core.

That may not work in all circumstances. So I am considering one of two options. The first option is courtesy of Josh Berkus. The idea would be to have a board of people that determine whether or not the deliverable has been met. This has merit because you have a panel of experts that make the determination. It can make the review process painless but is also takes the power out of both the funders and developers.

The second one is to take a vote. It would work something like this.


  • Developer states project is complete and demonstrates completeness based on the deliverable.
  • Every person who funded the project votes on whether or not the project is complete.

    • If 66% vote the project complete, developer gets their money.
    • If less than 66% vote project incomplete, developer doesn't get their money.

    There would have to be some caveats. First the funders need to be able to communicate with the developer because they may not have understood part of the spec. Further as we are working with Open Source, the end deliverable may have been changed based on the will of the community versus the developer (see the Hot Standby work with PostgreSQL).

    The voting would also need to be limited to a period of time. I was thinking 14 days. The 66% would be tallied against those that voted in that 14 days.

    I like this idea because it removes the third party and it stops a single funder from calling foul as they are part of a collective vote. What do you think?


Other than that the feedback has been extremely positive. I even posted to the LedgerSMB list and multiple people are excited to see this opportunity.

If you have ideas, please share them. We have setup a flame page just for this.

Remember a lot of your questions can be answered in the FAQ as well.


Categories: Business, OpenSource, PostgreSQL, Python, SQL

FOSSExperts, a new way to fund Open Source (Beta)
Posted Tuesday Jul 27th, 2010 03:37pm
by Joshua Drake
| Permalink

The cat is out of the proverbial bag. I originally planned to have a quiet roll out with a few close contributors but that has gone by the wayside. Now I am going to be pushing hard for people to test, beat on, object to, argue about, flame upon, scream at, praise and hopefully help us build out something that is truly useful for the FOSS Community. What am I blathering on about? FOSSExperts of course.

FOSSExperts is a new site specifically engineered to allow FOSS developers to raise money for projects they are trying to develop. The idea stemmed from the very cool Kickstarter. With our focus obviously being on a different kind of creative.

This is long overdue in the FOSS Community. There are a great deal of communities out there (LedgerSMB for example) that can use a place for their developers to try and raise funds for a specific feature. LedgerSMB just recently had a discussion on developing a Payroll module. Developing a Payroll module will be expensive for a single small company to absorb, but 20 small companies? Not nearly as expensive.

What FOSSExperts is not, is a place to send money to global projects such as Debian or PostgreSQL.org. It is for specific, well defined proposals and has a specific and defined delivery as well as refund policies etc.

Right now, we are in closed Beta. If you have a project or proposal you would like to try out you need to email me directly but we are interested. So take a look, and let the rage begin! If you like, you can review one of the larger proposals already on the site as well.


Categories: Business, OpenSource, PostgreSQL, Python, SQL

A better backup with PostgreSQL using pg_dump
Posted Friday Jul 23rd, 2010 11:23am
by Joshua Drake
| Permalink

This is generously borrowed from the PostgreSQL Docs, and updated to something that represents a modern approach to PostgreSQL backups. This documentation has always bothered me because it should have been re-written years ago. Yes I plan on submitting a more comprehensive version as a patch but I don't have time to push it into DocBook right now. If someone else wants to grab it, please do.

Yes, I really do believe the use of plain text backups is a mistake. Yes I realize PostgreSQL has the limitation of not being able to backup the cluster in anyway but plain text.

The standard for portable backups with PostgreSQL is pg_dump and pg_dumpall. When used properly pg_dump will create a portable and highly customizable backup file that can be used to restore all or part of a single database.

The pg_dump application acts as a standard PostgreSQL client. This means that you can perform this backup procedure from any remote host that has access to the database. You do not need to be a super user to use pg_dump but you must have read (and EXECUTE for functions) access to every object within the database.

Backups created by pg_dump are internally consistent, meaning, the dump represents a snapshot of the database at the time pg_dump began running. The backup will not block other operations on the database while it is working. (Exceptions are those operations that need to operate with an exclusive lock, such as most forms of ALTER TABLE.)

The minimum useful syntax for pg_dump is:
pg_dump dbname > outfile

However, the backup created from this method has limited usefulness. It can be used to restore a single database in full. A more useful and proper form of PostgreSQL backup syntax looks like this:
pg_dump -U $username --format=c --file=$mydatabase.sqlc $dbname

The options in detail are:
 -U, --username=NAME      connect as specified database user
 -F, --format=c|t|p       output file format (custom, tar, plain text)
 -f, --file=FILENAME      output file name

The most important of which is --format. By default pg_dump uses the plain text format. The plain text format is useful for very small databases with a minimal number of objects but other than that, it should be avoided. The custom format allows for a wealth of customizability. Using the custom format you are able to restore single objects from a backup. For example to restore only a specified index from a backup file:
pg_restore -U $username --dbname=$dbname --index=$indexname

If you wanted to restore only a single function:
pg_restore -U $username --dbname=$dbname --function=$functionname(args)

If you wanted to restore only a single table:
pg_restore -U $username --dbname=$dbname --table=$tablename

For more information on all the pg_dump options, please see the reference page.

Restoring the dump
The command used to restore a backup file is pg_restore. It has similar options to pg_dump. A simple restore:
pg_restore -U$username --dbname=$databasename $filename

Where filename is the name of the backup file.
Do not confuse --file with $filename. The --file option is used to turn a custom format backup into a plain text backup. The value of --file will be used as the output file for that transformation.


If you make the mistake of creating a plain text backup, pg_restore can not be used as a restoration mechanism. You can use psql to restore it:
psql $dbname < $backupfile


Backing up every database
The "postgresql" way of backing up every database is to use the command pg_dumpall. Unfortunately pg_dumpall can only create plain text backups and should be considered deprecated. However it is the only way to backup the globals in your cluster. A reasonable backup strategy to backup your globals and produce a flexible backup of every database in the cluster would look like this:
pg_dumpall -g -U$username --file=$globals.sql; 
psql -AtU postgres -c "SELECT datname FROM pg_database \
                          WHERE NOT datistemplate"| \
while read f; 
   do pg_dump -Upostgres --format=c --file=$f.sqlc $f;
done;

If someone knows of some Windows code that produces a similar result, it would be great if you would share.
Remember, pg_dumpall creates a plain text backup. This means you will need to use psql to restore the globals backup file.


After restoring a backup, make sure you run ANALYZE to update the statistics.

I know this isn't as comprehensive as it could be, but hey, its just a blog.

Categories: Business, OpenSource, PostgreSQL, SQL

Multiple Drupal installations, single login, 10 steps
Posted Thursday Jul 22nd, 2010 08:21pm
by Joshua Drake
| Permalink

We have several Drupal sites, no I am not typing this blog on one. We needed a way to have single sign on with these Drupal sites. One of which is PostgreSQL Conference. There are a few modules out there that can do it, some don't work with PostgreSQL, some are usable but not user friendly (HTTP AUTH) and still others use external services such as OAuth. I didn't want any of these. I wanted a simpler, more flexible solution. I found it with a little PostgreSQL know-how and a modification to the Drupal settings.php file.

The following is ten steps that assume we have three sites. At the end of the steps we will have single login between the three sites..

Step 1: Create users
psql -U postgres;
create user one with encrypted password 'foo';
create user two with encrypted password 'bar'
create user three with encrypted password 'baz';

Step 2: Create database and schemas
create database drupal;
\c drupal -- (assumes the use of psql)
create schema one authorization one;
create schema two authorization two;
create schema three authorization three;

Step 3: Sandbox users
alter user one set search_path = 'one';
alter user two set search_path = 'two';
alter user three set search_path = 'three';

Step 4: Install Drupal
For the sake of brevity I am going to assume you have unpacked three copies of drupal in the same directory. Perhaps /home/www/one, /home/www/two, /home/www/three . At this point you would use your web browser and set up drupal normally. Just assign your users appropriately to each install and set your database to drupal.

Step 5: Turn off caching (for testing)
Go into the Drupal Administration pages and turn off caching for every install.

Step 6: Alter users and sessions location
This will break your installs initially. Don't fret. It does not really matter which one you pick but for consistency we will assume that the drupal install one is the canonical version.
alter table one.users set schema public;
alter table one.sessions set schema public;

Step 7: Fix perms
create role drupal user one,two,three;
alter table users owner to drupal;
alter table sessions owner to drupal;
grant insert,update,delete on users to drupal;
grant insert,update,delete on sessions to drupal;

Step 8: Modify settings.php
Drupal offers the ability to use a single database for multiple installs using an array called db_prefix. Modify the value in each install to:
$db_prefix = array('users' => 'public.',
                 'sessions' => 'public.',);

Step 9: Test
At this point you should be able to login to each site using the user/pass from the one install. To test it further add a new user to any of the installs and see if you can login on a different one.

Step 10: Marvel (Oh and turn back on caching)
That's right, marvel. No obnoxious plugins. Simple overhead. Works even if the installs aren't on the same machine (although you would need to modify pg_hba.conf and possibly postgresql.conf).

Categories: Business, OpenSource, PostgreSQL, SQL

Let the jokes begin! PostgreSQL Conference West has changed locations.
Posted Wednesday Jul 21st, 2010 09:27am
by Joshua Drake
| Permalink

About a week ago I announced PostgreSQL Conference West 2010 CFP. In that CFP I also announced the location. A nice place, the Westin at Union Square in San Francisco. We were excited, the hotel was top knotch.

Then on Monday I received notice, the hotel acquisitions team (EDB) has received an amazing counter offer from a competing hotel.

The hotel is still in San Francisco, it is a four star hotel and the rate is much better for attendees (159.00 vs. 199.00). Here is the catch, which if you are reading on Planet you have to wait until after the jump.... Read more...


Categories: Business, PostgreSQL, SQL

Announcement: Simpycity 0.3.1 Released
Posted Tuesday Jul 20th, 2010 01:13pm
by Aurynn Shaw
| Permalink

Following up on the blog post covering the new coolness in 0.3, and better docs on working with Simpycity, we've just released Simpycity 0.3.1, our best release yet!

Simpycity can be downloaded from our Wiki, and our code is available from the Subversion repository.

Finally, starting today, all new releases of Simpycity are available on the PyPI package index, and Simpycity installable via:
$ easy_install Simpycity


Categories: OpenSource, PostgreSQL, Python

Active Object in Simpycity
Posted Tuesday Jul 20th, 2010 01:04pm
by Aurynn Shaw
| Permalink

Simpycity is, as we've previously covered, a small library that permits for the direct mapping of arbitrary SQL statements to Python callables.

This power allows for the development of complex representations that do not need to directly map to the underlying database representation.

This differs from most conventional ORM technology, which follows the ActiveRecord pattern.

Simpycity was implemented in this way for a great many reasons, first and foremost that the Active Record pattern is not the best representation of your data to your application logic. Should the application need to be aware of underlying data relationships? Should
the application be aware of foreign keys, structures, and other underlying constructs?

Or should the application be able to interact with the data in a form that is logical, and sensible to the application, without needing deep
knowledge of underlying representations?

We thought so, and Simpycity, and a concept more along the lines of Active Object is our result.

Disparate Representations


Simpycity, instead of writing SQL for you via query generators, requires that the developer write SQL by hand.

The reason for this is that database representations are not generalizable into object relations - this disconnect is the entire
reason behind the object-relational difficulties.
A proper object representation encapsulates all the possible information about a method in a single location, as well as all the necessary
methods to act on that data. A single object then represents a single quantum of data.

However, for a relational system, normal form requires that disparate pieces of information are further broken down, into points of absolute
truth about the data. A person's name, for instance, is a point of absolute truth, and should exist in only a single place in the database,
whereas a person's name could exist in several places in an Object system, in a sensible manner.

The disparity comes in that a Person, in terms of business requirements, is rather different from a Person in SQL terms, to the point where it would not be sensible to represent a database Person as an object Person - Address information, birthdate, all sorts of ancillary data that would normally be present isn't, per correct normal form.

Simpycity works to avoid this, by allowing for business models that have little if anything in common with the underlying table structure,
allowing for proper normalization as well as useful business objects.

Forging Anew


As Simpycity does not impose the database structure on your objects, it can't immediately provide the functionalities of .new() in the way a conventional ORM can - even though we've seen Simpycity handle the .save() feature brilliantly.

Instead, if you instance a Simpycity model, not from the database, you get precisely and only a Simpycity instance. As it's not connected to a known set of database data, all the functions and other associated items have no way of operating, and the model just sits there, forlorn and empty.

But since we have to match the Active Record pattern, how would we go about providing .new() in Simpycity?

Here's how we do it:

Given a standard model that looks like this,
class model(ctx.Model()):
     table = ['id','name']
     __load__ = ctx.Function("load_obj", ['id'])


We're able to do simple and basic load operations. Right?

But, to create a new object, the pattern more resembles:

class model(ctx.Model()):
...

new = ctx.Function("new_obj",['name'], return_type=model)


Which allows for the external interface of:

import yourmodel
o = yourmodel.new("Some name")
o.commit()


Providing a clean and sensible model API, following the ActiveRecord pattern,
but still offering all the power of Simpycity.

Twisty Little Properties


Another very nifty capability of ActiveRecord systems is that of reflection, automatically retrieving the far end of a foreign key constraint. This allows for useful functionality like

aModel.comments


correctly reaching across the one-to-many relationship and pulling all the comments.

As Simpycity doesn't directly map tables, capabilities such as this aren't directly implemented in Simpycity.
However, since we do realize that business objects need to perform similar tricks and load data in via properties, we added specific support for this into Simpycity.

But, since Simpycity is entirely callable based, we had to be able to support this feature with our existing metaphors. To that end, we included a simple function that will take any Simpycity callable (or any callable, really), interrogate its argument list, and handle argument mapping as you'd expect.

Using this feature is as simple as:

    from simpycity.helpers import prop
    
    class myTextObject( ctx.Model() ):
        table = ['id', 'value']
        __load__ = ctx.Function("textobject.by_id",['id'])
        comments = prop( get=ctx.Function("textobject.comments", ['id']) )
        
    mto = myText(1)
    comments = mto.comments


Easily allowing for sensible properties to be created, based entirely on clean Simpycity code. Properties created in this way even support set and delete functionality, identical to a standard property, allowing for property accessors to easily manipulate the database layer.

As a note, prop() is a new feature in 0.3.1. 0.3.0 and below should use

    from simpycity.helpers import sprop
    
    class myTextObject( ctx.Model() ):
        ...
        comments = property(sprop( get=ctx.Function("textobject.comments", ['id']) ))


Obviously not as clean, and not as capable. You should upgrade ASAP.

Next


We've stabilized the API, made everything work through the consistent Context interface, and have built a powerful callable-based model infrastructure for all sorts of application development.

So what's next for Simpycity? Well, some of the things we're planning on include breaking the Model object away from psycopg2 dependency, allowing us to use other PG drivers (such as pg8000), as well as opening up the Model protocol we've defined for other contexts - file access, for instance. Anywhere that an application needs to represent a complex underlying structure as a simple object, the Model could be used.

More in the future, we're really looking forward to integrating Simpycity callables with Django and SQLAlchemy model objects, using Simpycity to provide strong, clean functional and raw query support in those environments. And vice-versa as well: Binding a SQLAlchemy or Django ORM chain to a Simpycity object, using it to populate an object, and building even more complex, effective business objects for your application.

Even farther afield, we've been looking at integrating query generation to Simpycity. There's a lot of boilerplate SQL that needs writing, and being able to hand it to an elegant, PostgreSQL-focussed abstraction would be, we think, ideal.

As always, Simpycity is available on our Wiki, and our code can always be checked out from our Subversion repository.

Finally, Simpycity is easily installed from the PyPI index, using easy_install Simpycity.

Categories: OpenSource, PostgreSQL, Python

PgWest 2010: Call for Papers
Posted Wednesday Jul 14th, 2010 12:28pm
by Joshua Drake
| Permalink

PostgreSQL Conference West, The PostgreSQL Conference for Decision Makers, End Users and Developers, is being held at the St. Francis, Westin Hotel in San Francisco from November 2nd through 4th 2010. Submit your talk.

Time line:

July 14th: Talk submission opens
Sept 5th: Talk submission closes
Sept 10th: Speaker notification

This year we will be continuing our trend of covering the entire PostgreSQL ecosystem. We would like to see talks and tutorials on the following topics:
  • General PostgreSQL:
    • Administration
    • Performance
    • High Availability
    • Migration
    • GIS
    • Integration
    • Solutions and White Papers

  • The Stack:

    • Python/Django/Pylons/TurboGears/Custom
    • Perl5/Catalyst/Bricolage
    • Ruby/Rails
    • Java (PLJava would be great)/Groovy/Grails
    • Operating System optimization (Linux/FBSD/Solaris/Windows)
    • Solutions and White Papers


    Submit your talk.


Cool Features I'm Looking Forward to in PostgreSQL 9.0
Posted Tuesday Jul 13th, 2010 11:46am
by Aurynn Shaw
| Permalink

Recently, I was able to attend the local PostgreSQL community meeting here in Portland, and the topic du jour was covering the nifty and interesting features that are found in PG 9.0.
Confessing that I haven't really been paying close attention to what's new in 9.0, the talk was incredibly interesting - covering a range of new features in 9.0.

The ones I'm really excited about are:

/contrib/passwordcheck

This newly-added contrib module adds that feature that sysadmins are going to love - being able to add stronger password-based restrictions to the default PG password requirements.

You know the sort I mean - must be a mixture of capitalized and non, must be at least one number, must be at least n digits long. This is exceptionally useful if you're in an environment with strong password requirements.

samehost and samenet

These pg_hba.conf administration conveniences allow you to match any IP address that the server owns (same host) or any subnet that the server is a part of (same net).
This'll be very useful for any multi-homed DB server that needs to listen on all its interfaces; as well as providing an easy way to just say "all my peers" without explicitly needing to remember and declare the IP subnet.

Parameter change logging

This is a fairly useful change for anyone who's tried to figure out what changed during the last reload of PG. Instead of wondering, you can now configure a logging setting to report what parameters from the config were changed, what they were changed to, and what changes require a full restart before they can be used.

EXPLAIN now machine-readable

This is an amazingly useful feature, especially when combined with auto_explain - you'll now be able to get explain output in XML, JSON or YAML, allowing for easy parsing, manipulation and comparison of the EXPLAIN output.
This also allows for transformation of the output - easily converting it to HTML, for instance. Easy comparison before and after an index is created, simplifying programmatic discovery of indexes.

WHEN clause in triggers

WHEN clauses in triggers add a new level of power to the trigger system, by allowing PG to handle the large block of IF statements we always have in our triggers.
This should allow us to see a major performance boost for any conditional triggers, since PG doesn't have to descend to the trigger stored procedure for every modification.

NOTIFY now takes a string argument

Instead of just getting the notification from your database, you'll now be able to pass an optional string. With a clever sproc and some clever string formatting, you can now pass useful information to the NOTIFY receiver, saving yourself from excess work in figuring out what's changed.
Simpler event-driven programming? Never a bad thing.

Named parameters in stored procedures

This is huge for Simpycity. On 9.0, we can now handle argument naming in a whole new way:
SELECT * FROM my_func(param='string', another_param=3);


Being able to run queries like this will easily simplify Simpycity - just name your arguments the same in both Python and PostgreSQL, and everything will Just Work, clean and simple.

Keep an eye out for this update - I can't wait to add support in Simpcyity. Keep an eye out for the updates on the Simpycity project page

So Much More!

Of course, this is just a subset of the interesting things - the bits I'm really looking forward to.
You should go have a look at the whole list, easily found on the Illustrated Postgres 9.0 Wiki.

Categories: OpenSource, PostgreSQL

PostgreSQL High Availability options
Posted Monday Jul 12th, 2010 02:25pm
by Joshua Drake
| Permalink

PostgreSQL is widely accepted as the most scalable and stable Open Source database in the industry. It is also known to hold its own against any of the proprietary databases as well. There are a plethora of High Availability options available for every workload and business requirement. Below is a brief listing of the common High Availability options for PostgreSQL. This is by no means an exhaustive list but it does provide some starting points. (and before anyone yelps, 9.0 isn't out yet)

Log Shipping:
Read more...

Categories: Business, OpenSource, PostgreSQL


Copyright © 2000-2010 Command Prompt, Inc. All Rights Reserved. All trademarks property of their respective owners.