Mammoth
  • |
  • Contact
  • |
CMD | Command Prompt, Inc. - PostgreSQL Solutions, Support & Hosting
  • |
  • |
  • |
  • |
  • |
Kicking the Donkey of PostgreSQL Replication
Posted Tuesday Feb 4th, 2014 12:50pm
by Joshua Drake
| Permalink

This is the title of a talk I am developing for the matured PostgreSQL Conference:
PGConf NYC 2014 . Formerly a PgDay, this is now a full blown conference extending
two days with three tracks. From all reports it is set to be the largest
PostgreSQL Conference ever in the United States, surpassing even the old West and
East series (which no conference in the U.S. has done to date). It is truly exciting
times for our community.


This talk will be a departure from my standby talks of PostgreSQL Performance and
Choosing the right hardware. Katz asked me, "to bring your full East Coast from the West Coast personality.". I plan on doing so. So cinch up the boot straps it is going to be a ride. In classic JD style I am going to be blunt and to the point about the good, the bad, and the, "WTH were they thinking" of PostgreSQL
Replication.


So outside of personality what am I really trying to deliver to the community? I
think the description of the talk says it all:



  • Have you ever wondered how to configure PostgreSQL Replication?

  • Have you ever wondered how to configure PostgreSQL Log Shipping?

  • Have you ever wondered: Which one is best for my application?

  • Are you aware of the pitfalls of Replication? Where it breaks? When it will
    act in a way that is counter-intuitive? Do you know how to fix it?

  • Do you know how to monitor it?


  • If you have asked yourself any of these questions, this is the talk for you. We
    will step through PostgreSQL replication, the technology involved, the
    configuration parameters and how to configure it in a manner that isn't fragile.
    We will also cover gotcha's how to prepare for them and understanding what replication
    is doing.


    This talk is not not for the faint of heart. I will take a no holds barred approach
    and give you the real deal, the dirt and the gold that is one of the most sought
    after PostgreSQL features.


    On a closing note, this conference is showing how a PostgreSQL User Group, combined with the resources of United States PostgreSQL can help grow and educate the community. They don't just help NYCPUG, they also help PDXPUG, PHILLYPUG and SEAPUG. Maybe your PUG should consider working with PgUS? If so, give "Jonathan S. Katz" [jonathan.katz {@} excoventures.com] a jingle.


    Categories: Business, OpenSource, PostgreSQL, Python, SQL

    PgConf.eu is over, it was a blast but I am curious about the future
    Posted Wednesday Nov 6th, 2013 12:09pm
    by Joshua Drake
    | Permalink

    First let me say that I attended pgConf.eu like I attend every conference (that I am not running). I show up for a few hours on the first day, then I come back and attend my talk. I don't take travel lightly and as much as I bromance my fellow elephant bretheren, I want to explore the sights and this was freaking Ireland people.


    I had an odd feeling for the time I was there. The community was in full force, there was at least 240 people there and that was great. It was the commerce side, the sponsor side, the **money** side that was lacking. EnterpriseDB, Cybertec and 2ndQuadrant were there with booths but I wonder if it was even worth it?


    It is great that 3 of the largest European commercial contributors were sponsors. I think the traditional model of sponsorship is over and those companies are not getting their investment back. This is certainly not the fault of PgConf.eu nor is it the fault of the sponsors. It is habit to fall into what we know. It is that what we know, is broken. There is no true return on investment. You might pull a couple or even a half a dozen leads. Yes you get 20 minutes to give your speech at the end of the conference (when a lot of the people have left anyway) but business is about relationships. I don't see how 3 booths at a 240 person conference enables relationships to be built.


    I wonder what the attendees are getting out of it? Do we see these sponsors as more beneficial than companies (such as Credativ) who wasn't sponsoring with a booth but was speaking? Or are we just silently thankful that they are there because that way we don't have to spend 500 Euro to attend the conference? If the latter, does it make sense for companies to continue to contribute? What could a company do for the community (and themselves) with a 20k sponsorship versus having a booth and their name on a shirt?


    Which brings me to my final point. If pgConf.eu (or Pgcon, NYCPgConf) were to charge 500.00, and it was sponsor free (or at least more creative in presence) would you still go? What if that included a half day tutorial? What if the conference was only about PostgreSQL + Community. Yes, sponsors are part of the community and they are more than welcome to show up in matching shirts and speak when their talks are accepted but no booths, no littered t-shirts, just good old fashion community, and education.


    What do you think?


    Categories: Business, OpenSource, PostgreSQL, SQL

    5 Things a Non-Geek Girl Learned from Playing with Geeks at CMD
    Posted Wednesday Nov 6th, 2013 12:09pm
    by Angela Roberson
    | Permalink

    When I began at Command Prompt, Java was coffee, Python was a snake, and a Ruby was best used on the color of glittery slippers. If you would have asked me two and a half years ago what "PostgreSQL" does, I would have asked you what language you were speaking.


    A year later, I took my first sales call with out Joshua Drake (jd, @linuxhiker). I was shaking in my boots and it was inevitable that I was going to be sick. Then something happened as soon as I heard the customer say, "Hello".


    I understood what the customer needed and most importantly, I knew we could do it. I was able to say confidently and with out doubt, "Yes, we can take care of you."


    I still have a LONG way to go and I will never be a geek. But here are some key things a non-geek girl has learned from playing with geeks:


    1. It's a big, technical world out there. It is vast and fast-paced. To keep up, you have to continue to instigate fresh ideas and put the time in to develop them. You also have to learn from other people's ideas. The PostgreSQL world is about more than consulting. It's a community of people passionate about the dynamic capabilities of PostgreSQL.


    2. Document down to your underwear. We are passionate about documentation at CMD and you should be too. Only good things can come from documenting work and as Open Source advocates, we promote sharing the Postgres love.


    3. Keeping your software, hardware, and underwear (sorry, it rhymed)updated is key in keeping up the health of your systems. If you don't, you will have problems at some point down the road and you will kick yourself for not making the upgrades sooner.


    4. Our goal is for people to be successful with PostgreSQL. In order to help people do that, it sometimes takes asking "stupid" questions. It will be worth your technical ego if it saves you time, money, and your sanity in the future.


    5. Take risks! Customers sometimes get nervous when we tell them what they are currently using just isn't going to work. The unknown can be unnerving but we are here to make things run as smoothly as possible. Take the risk and trust us!


    So there it is, folks. You Postgres guys (and gals) out there do some really cool stuff. You make the technical world go round and most non-technical people don't even have a base understanding of your brilliance. I'm fortunate to have the CMD team to answer my "stupid" questions with patience and kindness. This is an adventure!


    Angela


    Angela Roberson has been with Command Prompt for two and a half years. She is currently the CMD Team Manager and can help you with all of your non-geek customer needs.



    Just back from NYCPug August, on to more talks
    Posted Tuesday Sep 10th, 2013 01:36pm
    by Joshua Drake
    | Permalink

    In August I spoke at NYCPUG on Dumb Simple PostgreSQL Performance. The talk was well accepted and there was about 60 people in attendance. I have always enjoyed my trips to NYC but this is the first time I have taken a leisurely look at the city. I found myself enjoying a water front walk from 42nd, through the Highline, to Battery Park, all the way to the Brooklyn Bridge and over to Brooklyn to a great pub for dinner. What I enjoyed most about the walk outside of the 10 miles was the community that was present. I think it is easy to get jaded by "midtown" and all that is the tourist in that area. The hustle and bustle, the pollution and dirt of a city. The waterfront walk however reminded me very much of Seattle, there was green and water everywhere, very little litter, lots of families and bike riders. All around a wonderful day and it made me look forward to coming back to NYC again in March for the NYC PostgreSQL Conference.


    Alas my travel schedule is just beginning. I will be speaking at the Seattle PostgreSQL User Group on Oct 1st. I will be giving the same talk as I did in NYC as it has been updated and always seems to be well received. If you are in town you should come join us:



    1100 Eastlake Ave E, Seattle, WA 98102


    I always enjoy Seattle and will also be making a side trip to Bellingham as it looks like I will be moving there soon. It will be sad to say good bye to the Columbia River Gorge but I am looking forward to the Mount Baker area as well as hopefully starting Vancouver, B.C. and Whatcom county PostgreSQL user groups.


    Just three weeks after Seattle, I will be cross the pond to wonderful Ireland where it is bound to be cold, dark and wet. That's alright though as I will plan on staying from Oct 26th - Nov 3rd, allowing for a full trip of sight seeing. Of course I will be dropping by on Friday the 1st to speak on PostgreSQL Backups which reminds me that I need to update the talk for the new pg_dump features found in 9.3.



    Compiling and installing OpenSRF 2.2 on Centos 5.9
    Posted Friday Jul 26th, 2013 10:39am
    by Joshua Drake
    | Permalink

    We do quite a bit of work for King County Library systems. The library system has 45 branches and runs the Open Source Evergreen ILS. One of the very smart things that the Evergreen project decided was that their database of choice would be PostgreSQL. One of the things that the Evergreen project is not good at is supporting LTS releases of Linux and therefore certain things can be a chore. For example, by default OpenSRF 2.2 which is the current stable OpenSRF release can not be installed via RPM or compiled from source by default on CentOS 5.9.


    When discussing with the community about CentOS, the response was the classic responses of, "just upgrade", "move to Fedora", "it isn't that hard to migrate to Debian" showing a clear misunderstanding of what it takes to support infrastructure on this scale. I digress.


    So what to do next? Well, get your hands dirty of course.


    Use yum to (make sure you have EPEL):
        * Install gcc44 gcc44-c++ libevent-devel httpd-devel
    
    ** Note: make sure you only have the 64bit versions installed. 
    ** Yum will sometimes install both i386 and x86_64.
    
    Download compile and install: memcache 1.4.5
    
        * wget http://memcached.googlecode.com/files/memcached-1.4.15.tar.gz
        * CC=/usr/bin/gcc44 CXX=/usr/bin/g++44 
          ./configure --prefix=/usr/local/memcache/
        * make install
    
    Download compile and install: libmemcache 1.0.7
    
    ** Note: 1.0.5 may also work as the requirement is libmemcache library version 0.8.0
    
        * wget \
          https://launchpad.net/libmemcached/1.0/1.0.7/+download/libmemcached-1.0.7.tar.gz
        * CC=/usr/bin/gcc44 CXX=/usr/bin/g++44 ./configure \
          --prefix=/usr/local/libmemcache/ \ 
          --with-memcached=/usr/local/memcache/bin/memcached
        * make install
    
    Add the new libmemcache to ld.so.conf (or create a /etc/ld.so.conf.d file):
        * /usr/local/libmemcache/lib
        * ldconfig -vv|grep mem (you should see something like this:)
             libmemcached.so.3 -> libmemcached.so.3.0.0
             libmemcachedprotocol.so.0 -> libmemcachedprotocol.so.0.0.0
             libmemcachedutil.so.0 -> libmemcachedutil.so.0.0.0
    /usr/local/libmemcache/lib:
             libmemcachedutil.so.2 -> libmemcachedutil.so.2.0.0
             libmemcached.so.10 -> libmemcached.so.10.0.0
             libmemcachedprotocol.so.0 -> libmemcachedprotocol.so.0.0.0
             libmemcached.so.2 -> libmemcached.so.2.0.0
             libmemcachedutil.so.0 -> libmemcachedutil.so.0.0.0
             libmemcached.so.2 -> libmemcached.so.2.0.0
             libmemcachedutil.so.0 -> libmemcachedutil.so.0.0.0
    
    The .so.3 is a dependency from the installed RPM. This should not cause any 
    problems as long as you see the /usr/local/libmemcache/lib line.
    
    Download compile and install: opensrf 2.2.0
        * cd (where you unpacked opensrf)/src/extras
            * make -f Makefile.install fedora 
              # this will install any outstanding perl packages etc
        * cd (where you unpacked opensrf); CC=/usr/bin/gcc44  \
            CXX=/usr/bin/g++44 CFLAGS="-I/usr/local/libmemcache/include" \
            LDFLAGS="-L/usr/local/libmemcache/lib" \ 
            memcached_CFLAGS=/usr/local/libmemcache/include \
            memcached_LIBS=/usr/local/libmemcache/lib LIBS=-lmemcached \
            ./configure --prefix=/usr/local/openils --with-gnu-ld 
        * make install
    
    cd /
    ln -sf /usr/local/openils .
    


    At some point I might turn this into some rpm packages but right now, we solved the problem.


    Calling Bullsh*t in Open Source communities
    Posted Tuesday Jul 16th, 2013 12:44pm
    by Joshua Drake
    | Permalink

    We are all human. We all lose our temper. We all have our moments of, "I really wish I could take that back". Of course not if you are not Linus Torvalds. Now everyone knows that Linus has a temper, that he is a foul mouth, lacks certain social graces and is generally one of the, if not the most important developers to surface in the last 20 years. Does that mean he gets to be a jerk? In his mind, yes.


    In some ways I agree with Linus. If you are a donkey butt and you don't pay attention to your community and follow its guidelines, then just leave. We don't have time for you anyway. We are trying to solve problems here. In other ways I don't agree with Linus because he exhibits a distinct lack of grace and grace will always get you further than angst. I have been taught this over and over through the years. Sometimes I get it right, sometimes I call out half the Postgres hackers in a room of 200+ and upset lots of itty bitty sensitive egos. So grace it is and grace I try to exhibit although sometimes fail.


    Linus could learn from this. Although he sits upon a high perch, he is greatly flawed in his thinking and there are some developers that are calling him on it.


    http://thread.gmane.org/gmane.linux.kernel.stable/58049/focus=1525074


    Linus excuses his behavior by stating it is just who he is:


    http://thread.gmane.org/gmane.linux.kernel.stable/58049/focus=1525074


    I once said that the days of Cowboy Open Source is over. Linus apparently didn't see that talk. People are going to tire of the, "I am a socially inept geek and therefore have an excuse to be a jackass excuse". In fact I would argue that most have already tired of it.


    Linus I would suggest you look at the -hackers community at Postgresql.org. There you will find a wide disparate group of hackers from dozens of different cultures who all manage to get the job done without juvenile threats, and foul language. They almost are always able to do it professionally too. Have a nice day.




    postgres_fdw for 9.2
    Posted Wednesday Jun 19th, 2013 12:51pm
    by Joshua Drake
    | Permalink

    We have backported the postgres_fdw to 9.2. It is read only of course as the infrastructure for writes is not in 9.2 but it is usable. Enjoy it!


  • Postgres-FDW


  • The steaming pile that is Precise with kernel 3.2
    Posted Thursday Jun 6th, 2013 12:03pm
    by Joshua Drake
    | Permalink

    I don't know if it is a mainline kernel problem but I can tell you that on
    Ubuntu Precise, Linux kernel 3.2 is a disaster for PostgreSQL. I am not even
    going to go into a huge rant about it. I am just posting the numbers. See for
    yourself. There should be a public service announcement about it.


    before upgrade to 3.9


    08:35:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
    08:45:01 AM     all     30.91      0.00      5.66     40.05      0.00     23.38
    08:55:02 AM     all     29.32      0.00      5.10     39.66      0.00     25.92
    09:05:02 AM     all     31.71      0.00      6.24     40.99      0.00     21.06
    09:15:01 AM     all     32.45      0.00      6.59     46.74      0.00     14.21
    09:25:01 AM     all     20.62      0.00      5.39     60.00      0.00     14.00
    09:35:01 AM     all     31.03      0.00      3.61     33.95      0.00     31.41
    09:45:01 AM     all     36.54      0.00      3.22     34.13      0.00     26.11
    09:55:02 AM     all     40.17      0.00      3.66     30.98      0.00     25.19
    10:05:01 AM     all     33.49      0.00      3.04     32.28      0.00     31.19
    10:15:01 AM     all     48.63      0.00      2.87     25.50      0.00     23.00
    10:25:01 AM     all     51.34      0.00      3.56     26.06      0.00     19.04
    10:35:01 AM     all     39.41      0.00      3.44     29.86      0.00     27.29
    10:45:02 AM     all     36.07      0.00      8.79     30.94      0.00     24.20
    10:55:03 AM     all     38.04      0.00      7.98     32.98      0.00     21.01
    11:05:11 AM     all     39.25      0.00      8.81     36.75      0.00     15.19
    11:15:02 AM     all     35.19      0.00      8.76     41.98      0.00     14.07
    11:25:03 AM     all     38.21      0.00      9.65     38.86      0.00     13.28
    11:35:02 AM     all     42.92      0.00     11.66     34.28      0.00     11.14
    11:45:02 AM     all     39.40      0.00      9.96     39.03      0.00     11.61
    11:55:01 AM     all     28.72      0.00      3.27     36.32      0.00     31.69
    


    after upgrade to 3.9


                    CPU     %user     %nice   %system   %iowait    %steal     %idle
    08:35:02 AM     all     40.08      0.00      4.46     10.66      0.00     44.80
    08:45:01 AM     all     38.80      0.00      3.94      7.96      0.00     49.29
    08:55:01 AM     all     31.48      0.00      3.03      2.58      0.00     62.91
    09:05:01 AM     all     32.18      0.00      3.09      3.86      0.00     60.87
    09:15:01 AM     all     26.71      0.00      2.39      3.52      0.00     67.39
    09:25:01 AM     all     30.49      0.00      3.10      2.80      0.00     63.61
    09:35:01 AM     all     32.50      0.00      3.49      3.42      0.00     60.60
    09:45:01 AM     all     36.76      0.00      3.85      6.39      0.00     53.01
    09:55:01 AM     all     44.45      0.00      4.63      9.23      0.00     41.69
    10:05:02 AM     all     38.39      0.00      4.28      8.60      0.00     48.72
    10:15:01 AM     all     33.57      0.00      3.53      4.10      0.00     58.80
    10:25:01 AM     all     29.42      0.00      2.96      3.16      0.00     64.45
    10:35:01 AM     all     32.90      0.00      3.37      5.33      0.00     58.40
    10:45:01 AM     all     34.56      0.00      3.62      4.32      0.00     57.50
    10:55:01 AM     all     34.84      0.00      3.37      4.27      0.00     57.52
    11:05:02 AM     all     38.30      0.00      4.05      7.56      0.00     50.08
    11:15:01 AM     all     36.80      0.00      3.54      9.51      0.00     50.16
    11:25:01 AM     all     34.79      0.00      3.82      8.17      0.00     53.21
    11:35:01 AM     all     32.68      0.00      3.07      4.97      0.00     59.28
    11:45:02 AM     all     31.77      0.00      3.45      6.07      0.00     58.72
    11:55:01 AM    all     33.58      0.00      3.92      6.39      0.00     56.10
    



    Returning multiple results without a round trip
    Posted Friday May 31st, 2013 10:58am
    by Joshua Drake
    | Permalink

    My blog on changes to the wire protocol [1] prompted this question from a reader:



    "Would it be necessary to modify the wire protocol to support multiple query/result-set combinations per server round-trip? That is, to be able to send a hundred different queries (each with a different number and type of columns in the result set) and receive a hundred different results all in a single network round-trip? That is a feature supported by some other databases that can have an order-of-magnitude effect on the performance of high-latency installations (satellite, etc.)."


    I did a little research into this and it seems that we can already do this, sort
    of. See the following:


    postgres=# select 1; select 'two'; select 'three'; select 4;
     ?column? 
    ----------
            1
    (1 row)
    
     ?column? 
    ----------
     two
    (1 row)
    
     ?column? 
    ----------
     three
    (1 row)
    
     ?column? 
    ----------
            4
    (1 row)
    


    This will avoid the round trip. However there is a limitation. If these were
    tables, the server would only send the metadata of the last SELECT. The metadata
    being things such as table description, and number of rows returned. So if you
    return data from 4 tables, you are only going to know what to do with the last
    of the tables returned.


    So it appears that the wire protocol does indeed support the ability to send
    multiple sets back but the server / libpq does not support sending multiple
    sets with isolated metadata.


    The benefit of supporting this could be huge. Consider in a web word you can
    often end up calling dozens if not hundreds of queries for a single page draw.
    A network round trip on its own is not expensive but if you add up 30 or 200 of
    them, the performance degradation adds up quickly. This is especially true when
    you consider that they each have their own overhead with the server processing
    the query.


    1. Modifying the backend protocol for 9.4/10.0



    Modifying the backend protocol for 9.4/10.0.
    Posted Wednesday May 29th, 2013 01:17pm
    by Joshua Drake
    | Permalink

    A recent discussion on the lists about potentially incompatible changes
    to 9.4/10.0 of PostgreSQL the idea of things we wanted to do to the wire
    protocol in upcoming releases.


    The wire protocol is the language spoken between a client and the server of
    postgresql. The majority of programming languages out there do not implement
    their own version of the protocol instead opting to bind to the C library libpq.
    There are notable exceptions, specifically C# and Java both of which implement
    native versions of the wire protocol for their respective languages.


    The current wire protcol is v4 and was developed for 7.4 of PostgreSQL. That was
    released in 2003. I think we can safely say there is a lot of new thinking and
    things we want to change in the protocol to bring it up to newer thoughts on
    performance and features. A list of which can be found here:


    https://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes


    One of them is particularly entertaining: Use Commpression. We had a
    patch for using compression on the wire protocol many years ago and we were
    essentially told there wasn't a use case. Funny how things change over time.
    That is the way of the community (replication now in core, cough cough) and
    we love it for it. The community is relatively conservative and because of that
    we have arguably the most stable and feature rich RDMS of it.


    An item I suggested was the ability for the wire protocol to understand the type
    of connection that was being requested. In short I want the ability to tell
    a connection that it is a read-write, read-only or write-only connection. In
    practice a read-only connection would work something like this:


    conn-r = psycopg2.connect(database='testdb', user='test', conn-type='r')
    


    I think this would be particularly useful to a tool such as pgPool-II. PgPool-II
    is a sophisticated software that supports replication, load balancing and query
    caching. Although we only use it for load balancing, which it is very good at.


    The problem is, it is relatively slow. It is slow because it has to parse each
    query that is executed to determine what to do with it. If it is a SELECT only
    then it can be load balanced. Again, it works well but if pgPool-II knew from
    the get go the type of connection it was dealing with because the protocol told
    it on negotiation then it wouldn't need to parse the query because the query
    was called on the conn-type='r' handle.


    I could see it also being useful in the future if we were to get load balancing
    into core. The connection receiving postmaster would see that it is a read-only
    connection and automatically pass it a replication slave.


    To see more of what is being discussed you can review the thread.




    Copyright © 2000-2014 Command Prompt, Inc. All Rights Reserved. All trademarks property of their respective owners.