Tuesday, July 26, 2011

persona versus real name - twitter / google+

The 'individual' version of Google+ is in 'beta' right now.

Once the individual version is launched, there will be work on the 'corporate' version of Google+ designed for trading entities.

As we are talking about a beta product and only speculating about what might come from the future business version, all of this might change.


Real name versus Persona - Twitter / Google+

There are strong arguments both ways.

Here I give my input...

There is one company that has registered 500 or so twitter accounts for its corporate marketing.

If this real name thing helps reduce the 'persona' style stuff, which is often just a marketing tool used by one or two dumb companies overdoing the marketing, then, for me, that is a positive.

It is one of the things that put me off twitter a bit, hundreds of corporate accounts from the same company, all spewing the latest marketing hype.

Not saying I agree entirely with the Google approach (it does have some drawbacks - some of which you have mentioned), or that I disagree with your sentiment ... just presenting some input.

Twitter is there for those folks who feel better about wearing a persona ... like being at a masked ball, everyone else is allowed to wear a mask also.

There are lots of drawbacks to the 'real name' approach, and the blog of Alison Wheeler is a good source of counter argument.

I leave it up to you to form your own arguments and decide if you want to be on Twitter, Google+, both, or neither :)


Notes and Further Reading:

There are a whole bunch of issues around gender, that may come up in the debate about Google+ and real names.
Note: As of July 12th 2011, Google has made gender optional
Your gender is no longer required on your profile: Some of you were concerned about having to share your gender with people you don't know. We hear you, and we've now made displaying your gender on your profile optional.

    Sunday, July 24, 2011

    internet holiday companies - hand to mouth - atol / abta

    Having a look at some of the fantastic deals for holidays in Egypt at the moment.

    Are the offers 'too good to be true'?

    Maybe.

    The global recession and the increase in airline fuel and flight taxes, has meant fewer people travelling abroad.

    The same number of package holiday companies are chasing a shrinking pool of customers.

    The main questions that you should ask of any 'internet only' travel company:
    • Which charter company will be providing the flight?
    • Do you regularly rearrange flights within a month of departure date?
    That second question is one which you will not ask the travel company directly. Do a bit of research and find out if, last year, folks had their travel dates moved, or flight numbers changed.

    ( Last minute rearrangements of charter flights could be an indicator of problems meeting payment terms with charter companies. )


    Cash flow and travel companies living 'hand to mouth':

    Last year it was the turn of Goldtrail. This year who knows.

    Goldtrail was not a new venture, they had been in business since the late 90s!

    When there are too many operators vying for that shrinking pool of business, the temptation is there to keep cutting costs, and offering even more competitive pricing.

    So much so that the entire business can be put at risk!

    It might sound appealing to be a customer 'chased' by these increasingly good offers, however there is another side to things.

    Cutting all of the profit from the deal, can leave some of these travel companies with cash flow issues.

    Did they already pay the charter company? Or is some of my flight cost just patching up a cash flow issue?

    It is entirely possible that companies with this sort of cash flow issue, might be living hand to mouth, and paying your charter flight from expected future business.


    Atol / CAA / Abta protection - reality and myth:

    In the UK the Civil Aviation Authority (CAA) / Atol might offer some protection for your flight.

    If you book an Internet holiday and the company is using a UK charter company, then you might have some protection as hinted above, however this will be in the form of an 'after the fact' remedy.

    In practice your name will be added to a list of customers suffering loss due to unpaid charters / lost flights. Months later, if successful, you might get some or all of the cost of the flight back.

    Read up on Abta or Aito protection. It may offer some protection for the entire package, however that would probably only apply if the package was booked in the UK through a bricks and mortar travel agent (perhaps high street / retail park / town centre)


    Notes and Further Reading:

    Extract from the February 2006 article (link below)
    ABTA has announced it will 'cease direct payments to consumers in respect of failure of member agents conducting retail activities when nothing has been booked by travel agents.'
    Websearch the phrase "Package Travel, Package Holidays and Package Tour Regulations 1992" for some details, of the mandated regulations, regarding financial protection and package holidays.

    In the UK Aito do offer some protection if your package is booked through an independent travel agent. See their website here.

    Saturday, July 23, 2011

    knee jerk - the removal of freedom and civil liberty - Utøya island

    Having visited Norway several times, I find it a refreshing experience.

    One of the things that struck me when I visited Oslo, was the relaxed and untroubled nature of the locals.

    The openness and integration of the parliament into the main areas of Oslo seemed unusual, compared to the segregation and visible police guard (in numbers), of say London.

    Knee Jerk reactions are inevitable, however that does not necessarily mean that these reactions make it into law, or are allowed to alter that relaxed atmosphere irrevocably.

    To the politicians of Norway and the Mothers and Fathers of those victims of Utøya island, the most important thing in the coming months might seem to be to 'prevent this happening again'.

    To the people of Norway, do not allow this tragedy to be used as an excuse for clamping down on civil liberties. The loss of personal freedoms as a result of this outrage would be, some might feel, a greater loss.

    There is a whole industry around 'security', from the people that manufacture the CCTV, to the smartcard industry, to the record check bureaus, to the paid security services that could be deployed by mandate.

    That very industry will be working, and lobbying hard, over the coming months to have a greater role in the lives of ordinary Norwegians.

    The hardest question that anyone can ask is...
    Could this really have been prevented?
    The answer, which nobody with a financial interest in the security industry would actually say out loud is "Probably not".

    It will take a long time for the healing process in Norway to be complete. I hope that Oslo and Bergen retain their excellent reputation as great places to enjoy a city break.

    Friday, July 8, 2011

    us budget negotiations - end of shuttle program

    In the rush to devote headlines to Ireland and Greece, commentators overlook a greater threat - US default.

    Here is the scariest graph I have seen in a long time:


    Source: BBC 7th July 2011


    First scary portion - history Bush and Clinton

    Look at the portion 2001 to 2002 - that downward slope on the blue line shows a drastic fall in revenue - either an economy shrinking or a tax cutting agenda.

    Clinton left office on January 20, 2001; however, for much of fiscal 2001 the tax policies put in place in 2000 would have been in effect.

    Do a websearch for Clinton+Greenspan, and form your own opinion as to what happened there.

    I will not even comment in detail on 2001->2008 and Bush - look at the Blue revenue line versus the amount being spent - frightening.


    Second scary portion - needs revenue climb from 14% to 20% of GDP

    Sales, Sales, Sales.

    Exports, Exports, Exports

    Those two sentences are, perhaps, the preferred way of increasing revenue - sell more outside of your own country.

    I do not doubt the need to increase taxes drastically and some of that activity has already happened.

    If this makes uncomfortable reading, then go back and look at the graph again - the gap between the lines must be narrowed, and narrowed quickly!

    The US economy needs to grow (internally or externally) by 6% in two years, a staggering challenge.*

    Are Obama and Bernanke the team to do it? I hope so.

    *Germany just managed 3% growth, however they are the outlier.


    The end of the Space Shuttle program:

    Most folks with an interest in Science will probably feel a tinge of regret when the Space Shuttle program at NASA closes. Hey, personally, I love the Shuttle.

    Go back and look at the brown line in that graph. However it is achieved, spending needs to come down.

    In the UK we are having to make difficult choices also, it is not easy. We recognise the need for cuts, but feel a natural hesitation when a final decision is made to cut some funding area.



    Debt as % of GDP - surely that is the measure of 'financial strength'?

    Speak to a bank about personal financial difficulties and they will ask about:
    1. How much you spend per month, versus how much you earn per month
    2. Your total debt versus your annual income.
    The graph I referred to earlier is (1) and Debt as a % of GDP is closer to (2)

    The UK has a truly shocking picture when viewed under the (2) microscope - the UK as a nation in 2011 owed 82% of GDP (Up from 75% in 2010)

    Is America worse?


    Well here is the thing - I am not going to argue the case, however taking what is known for sure - a GDP figure for the US of 15 trillion is about right.


    If US debt stood at 5 trillion then that is about 33% of GDP as debt (very roughly)


    However official IMF figures have US listed as 99% of GDP as debt.

    That figure fits broadly with the picture painted on money.cnn.com

    So where is the distinction - who can say 5 trillion or 15 trillion?

    A figure of $4.5 trillion is doing the rounds in news articles at the moment, inspired by the narrower definition 'US bonds owned overseas'.

    In the rush to talk about 'owned overseas' some of those articles are popularising a figure that represents less than half of total debt.
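The arithmetic behind these percentages is simple enough to check by hand, and a small sketch makes the gap between the two debt figures obvious. The 15 trillion GDP figure and the 5 / 4.5 trillion and 99% figures come from the text above; the function names are purely illustrative.

```python
# Rough debt-to-GDP arithmetic, using the round numbers quoted in this post.
US_GDP_TRILLION = 15.0  # the "about right" GDP figure used above


def debt_percent_of_gdp(debt_trillion, gdp_trillion=US_GDP_TRILLION):
    """Return debt expressed as a percentage of GDP."""
    return 100.0 * debt_trillion / gdp_trillion


def implied_debt_trillion(percent_of_gdp, gdp_trillion=US_GDP_TRILLION):
    """Return the debt (in trillions) implied by a given percentage of GDP."""
    return percent_of_gdp / 100.0 * gdp_trillion


if __name__ == "__main__":
    print(debt_percent_of_gdp(5.0))      # the 5 trillion figure
    print(debt_percent_of_gdp(4.5))      # the 'owned overseas' figure
    print(implied_debt_trillion(99.0))   # the debt implied by 99% of GDP
```

On these numbers 4.5 trillion works out at 30% of GDP and 5 trillion at about a third, whereas 99% of GDP implies a total close to 15 trillion, which is why the two headline figures describe very different things.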


    The UK situation 82% of GDP - how shocking?

    Just so. Looking at the top 50 'financial offenders' :) we see that 82% buys the UK a place in the top 30. The UK ranks 22nd between France and Canada in the list.


    Source: Scrape from imf.org figures for 2011 (see link at the end of this article)

    How did the UK get into that position? The financial sector, as a proportion of all business in the UK is very significant.

    When the banks failed, the bailout was big.

    How did the UK banks fail, was it property speculation like Ireland, or subprime like America?

    There was in the UK an over-reliance on 'the housing market' to drive growth, it wasn't the overheating that killed it (like Ireland), but the lack of new entrants due to low growth elsewhere in industry.

    The UK did not have the subprime mortgage situation of the US, however personal loans and credit cards to 'subprime' candidates, had been rife for a decade.

    Here are some questions about the interplay between countries that might twist your noggin :)
    • In lending the US a trillion dollars, did China provide the rope with which it would hang itself?
    • If China had instead stashed that trillion in the worlds biggest mattress, would there have even been a financial crisis?
    • When MBNA arrived in the UK 8->10 years ago, offering low interest credit, to a wider range of folks, was that the start of UK personal 'subprime'?
    I mentioned China twice in those points, so let me answer that second point, to show this is not an attempt to blame China.

    Showing a fool your money and offering to lend it to them is not a crime.*
    Lack of oversight and appetite for greed, will fuel the sort of activities that existed in many countries prior to the recognition of a crisis.

    *Part of the solution this time around is to make it socially unacceptable, for banks to lend to 'high risk' individuals.

    The reason it was a 'Global financial crisis' was in part due to the interconnectedness of capital borrowing, however it was Global also, because so many countries were 'at it'.

    An appetite for risk, combined with governments afraid to allow institutions to fail: it could be the stuff of fiction. Unfortunately not.

    It was 70 years from the Great Depression to the Global Financial Crisis.

    It will happen again; the big question is whether it will be around 2070, or whether human memories have become shorter or longer this time around.

    In 2070 perhaps somebody could do a graph showing the maximum % mortgages available in past decades. 80% mortgage means you have to have 20% capital yourself. 90% mortgage means you have to have 10% capital yourself.

    When the world is offering 100% mortgages, then we will again be asking for trouble.


    The Euro as a gravytrain - why attitude matters:

    This applies to recent entrants as well as those who joined at the outset.

    If within your own country, leading politicians, are of the opinion that joining the Euro is good because it allows you to be lax financially, and have others share the pain, then...

    You are heading for trouble.

    Were you in Ireland or Greece around the time that those countries joined the Euro? Please add something in the comments of this post if you have some first hand experience.

    Did Ireland view the Euro as a way of maintaining the gravy train?
    It may be that Ireland viewed joining the Euro as a way of cosying up to partners who might then be shy, about cutting the generous CAP allowances for fallow fields.

    Apparently Greece falsified some key economic statistics, in order that it might meet the entry requirements for the Euro.

    Banking is an organic system, if you are dishonest, or think that sustaining an economy on generous allowances from other members, is the way to go, then the wheels will come off at some point in the future.



    Lending to 'extremely high risk' countries - French Banks:


    Half of the entire Greek debt is made up of money loaned by French banks.

    Why did (and do) these banks loan to people who are considered 'too risky' by other European lenders?


    This is a very different situation from Argentina before its crisis, where Wall Street and 'private' European investors were the ones doing the ready lending.


    Bonds with no maturity date - Argentina II:

    One of the contributing factors as to why the crunch in Argentina came so late and so hard was the lack of a 'by when' in loan agreements.

    Bonds have a fixed term.

    When you start 'rolling over' terms or extending the life of a bond, the warning bells should sound.

    This is why with the Greece situation at the moment, a 'rolling over' of debt will rightly be treated as a default marker.

    If misguided financiers choose to pressure bondholders into accepting flexibility around when the bond terminates, then that is just storing up more trouble for the future.

    The financial crisis in Argentina should have happened in 1990 rather than 2000.
    By rolling over bonds and introducing greater flexibility into repayment dates of debt, external financiers were making a bad situation worse.


    Lack of transparency - a risk to any financial system:

    By allowing large financial transactions through private accounts, there are some organisations who, it might be speculated, are contributing to financial instability.

    When a government and / or large enterprise has access to a clearing system, that allows huge financial transactions to go unpublished, then the system suffers.

    Examples of transactions that might pass through such a system:
    • Private (and secret) buying of debt from default countries
    • Private (and secret) payments as a result of winning contracts
    • Private (and secret) reserve accounts held by countries who are on notice of economic sanctions.
    Any 'bona fide' banking service which allows transactions of £5000 or more must register a 'Transaction report' with European or American authorities.


    How companies like Clearstream and Clearstream account holders such as Société Générale and Siemens can justify a service that does not adhere to 'Transaction reporting requirements' is beyond me.

    That is not to say that Clearstream has just European users, there are American companies involved also.


    Notes and Further Reading:

    On a personal level, I have no great affiliation to any American president.

    I am neither a great supporter nor strong detractor of Clinton, Bush, or Obama.

    I simply did some light analysis of past financial performance, and commented on the challenges that the current administration face.

    Can president Clinton claim that he was just following advice? Possibly, although I am sure there are strong arguments for and against such claims.


    I have no intention of reading "My Life", however there must be at least some mention of finance and economy in it somewhere.


    Further Links:
    A comment attributed to Hillary Clinton regarding the current levels of Chinese investment in US:

    China now holds $755 billion in U.S. securities.
    I have no way of verifying the actual figure, a more recent article by the Guardian puts the figure at $1,152.


    Just a final note about rating agencies...

    Fitch, Moody's, Standard & Poor's - all US rating agencies right? Wrong.

    Fitch is a subsidiary of the French company Fimalac SA.

    The BBC program 'All watched over by machines of loving grace' provides an interesting backdrop to the global financial crisis - apparently we asked for it.

    Saturday, July 2, 2011

    cloud providers - patriot what?

    There are at least 10 US cloud appliance* providers who are selling into the UK at the moment.

    Some offer a 'European' region service based in mainland UK or Ireland

    Example Region Location:
    • Rackspace - London & Slough
    • Amazon - Dublin
    • CenturyLink - London and Frankfurt


    European Region - what and why?

    The Rackspace and Amazon European appliance region is a way of reducing latency between the 'home' customers and the appliance.

    In short, if you are a UK company and most of your web traffic comes from UK & Europe, then it makes little sense to force all that traffic through a transatlantic web pipe to Northern Virginia.

    Another reason for a US cloud provider to offer a European location is to offset the argument that using the cloud is off-shoring outside of Europe.

    By creating jobs in London and Europe, both Rackspace and Amazon appeal to the desire of UK businesses to support some employment, in the country where they sell.

    In Amazon's case there is a tax benefit to their choice of Dublin - 12.5% corporation tax.

    For more US cloud providers with a European region, consult the list maintained by CloudHarmony


    Privacy, compliance, and National Security of a Foreign Government:

    Here everything gets a little complicated.

    Some preamble ...

    Firstly, I do not intend to disparage 'the cloud' or any of the cloud providers, who have HQ and corporate tax base in the US.

    Secondly I do use Rackspace & Amazon myself, in amongst a mixed pool of server provision.

    Thirdly I operate servers in the UK and US - I believe that I am fully informed, and aware of the implications of hosting data in those locations.

    I will not cover the patriot act in detail, however as a business, you should be aware of the following...
    If you host data in the US, or in a regional datacentre operated by a company having HQ and corporate tax base in the US, then your data is covered by the patriot act.
    What this means in effect is that, provided the US authorities can meet the oversight requirements, Rackspace, Amazon, and Microsoft might be obliged to hand over any and all of your hosted data to comply with an access request.

    Quoting from a recent readwriteweb article:
    Microsoft has admitted that it will hand over data to the U.S. government, if properly requested, even if that data is stored somewhere other than the U.S.

    The issue, according to ZDNet's Zack Whittaker, is that because Microsoft is a U.S. company it has to comply with the Patriot Act, and that means handing over data that may be offshore. The same rules would apply to Amazon Web Services and any other U.S. based cloud provider that has servers overseas.
    With server memory price decreasing, and processors getting more powerful, cloud datacentres are placing 50 / 100 / more appliances on each underlying host.

    If just 1 tenant (cloud appliance) is subject to a seizure request, then the data of the other 99 tenants gets carted off with that server also.


    Data protection act and Encrypted Data requirement:

    In the UK every business has a mandatory requirement to apply best practice security to customer data; however, to my knowledge, data on company servers is not currently required to be encrypted.

    Here we now have two things working against each other it seems.

    Using a US headquartered cloud provider increases the requirement for server data encryption ... your customer data must be protected from all external access (disclosed or otherwise).

    However the UK requirement for 'key disclosure' (RIPA part III) could be viewed as a disincentive to businesses' use of encryption keys.

    Employees who want to avoid personal responsibility for 'key disclosure', are probably going to be less willing to engage with US headquartered cloud providers.


    Cloud providers with HQ in Europe:

    To my knowledge UK, Germany, France, Canada have nothing even remotely close to the patriot act, and its non-court order server seizure.

    Here is a sample list of cloud providers:
    • Elastic Hosts (elastichosts.com) [ HQ in Worcestershire ]
    • Memset (memset.com) [ HQ in Surrey Research Park ]
    • VMhosts (vmhosts.co.uk) [ UK plc with HQ in West of Scotland Science Park ]
    • UK2.net (formerly vps.net) [ HQ in London ]
    • CloudSigma (cloudsigma.com) [ HQ in Zurich, Switzerland ] 
    • OVH (ovh.co.uk) [ HQ in Roubaix in Northern France ]


    Notes and Further Reading:

    Rackspace operates data centers in Texas, Illinois, Virginia, UK, and Hong Kong

    The European Commission has recently been asked for an opinion regarding several of the points raised in this article.

    In this article I have argued things from my personal point of view, here is a thorough counter argument, by SurveyMonkey, to some of the points I touched upon.

    Just one comment on the SurveyMonkey blog article...
    There is a distinction between applying for a warrant and court process, versus the FBI non-court authorised server seizures, that have happened under the banner of the US Patriot act.

    2014 update: The recent case with Microsoft being ordered by a US court to hand over data stored on servers in Dublin illustrates the problem is still an issue 3 years after this blog post was first published.

    Judge Preska quote: "It is a question of control, not a question of the location of that information"

    I hope the message from this article is not Anti-cloud, as that is not my intention. If I were to repeat one line from this article as a summary it would be:
    When choosing a cloud provider, do your research
    If Scotland or Iceland manage to really sell and deliver the 'lower cooling' promise, then perhaps these locations might be future cloud provider HQ, or Regional datacentres.

    Friday, July 1, 2011

    python gdbm, like rrd, is 64 bit specific

    If all your machines are 64 bit installs, then this issue is not something you will likely come across.

    However if you have some 32 bit and some 64 bit machines, and try to take mydb.gdbm across from a 32 bit to a 64 bit machine, or vice versa, then look out.

    gdbm is wordsize specific. That means that your sample 10 row database created on a 32 bit machine, will look different than the same 10 row database created on a 64 bit machine.


    gdbm fatal: lseek error

    Not the most helpful message, but should you encounter this message then you have probably hit the issue I described above.

    Solution: Create your data on the appropriate architecture machine.

    Round robin databases (.rrd) have a similar issue and this restriction is long understood.

    GDBM files are not portable between different architectures.

    If a database being wordsize specific is a serious limitation for you, then gdbm and rrd are probably not the technology you should invest your time in. Perhaps sqlite or mongodb are alternatives you might consider.
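If the data already lives in a file created on the 'wrong' architecture, one workaround is to export the key/value pairs to plain text on the machine that created the file and rebuild the database on the other machine. What follows is only a sketch of that idea: the function names are made up for illustration, and it assumes a database object that behaves like the ones returned by Python 3's dbm modules (bytes keys and values, a keys() method, item access).

```python
import base64
import json


def export_records(db):
    """Return one JSON text line per key/value pair of an open dbm-style database.

    Keys and values are bytes, so they are base64 encoded to keep the
    export plain ASCII and independent of machine word size.
    """
    lines = []
    for key in db.keys():
        record = {
            "k": base64.b64encode(key).decode("ascii"),
            "v": base64.b64encode(db[key]).decode("ascii"),
        }
        lines.append(json.dumps(record))
    return lines


def import_records(db, lines):
    """Store the exported pairs into a database opened on the target machine."""
    for line in lines:
        record = json.loads(line)
        db[base64.b64decode(record["k"])] = base64.b64decode(record["v"])
```

On the 32 bit machine you might open the file with dbm.gnu.open('mydb.gdbm', 'r') (the module was simply called gdbm in Python 2), write the exported lines to a text file, copy that text file across, and then feed the lines to import_records() with a database freshly created on the 64 bit machine.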


    Notes and Further Reading:

    gdbm was originally created as part of the GNU project.

    The GNU project acted as an incubator and first implementation for this project, which has now gone on to maturity / stability.

    sample data in postgres

    This article takes some sample data and imports it into Postgresql (i) using CSV import and (ii) using INSERTs from an export of MySQL

    Now to get the data into postgres we could go back to the spreadsheet (csv loading approach) or maybe take a compatible export from mysql and load that (compatible loading approach)

    The data we will be using can be found here and I am about to start working with this file.

    ( If you are coming from a MySQL background and want to just load the sample 'employees' database into Postgresql, then the project at this link will help with that )

    CSV loading approach:

    In postgres use COPY FROM to get data from the filesystem into the database.

    But wait; surely you need to have already created a table so as to have a table to load into?
    Yep, otherwise you would see postgres complain about a missing relation like so...

    postgres=# COPY amd_bang_per_watt FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv' WITH CSV QUOTE AS E'\042';
    ERROR: relation "amd_bang_per_watt" does not exist


    The table does not exist so you need to 'create table'.

    I cover this in more detail in the next section 'Compatible loading approach' so either consult there or have a go yourself first perhaps.

    I now assume you have the table amd_bang_per_watt created.

    My table needed emptying first but you can likely ignore the next command:

    : #root@156ns1(~) ;echo 'delete from amd_bang_per_watt' | psql amd_power_dissipation postgres
    Password for user postgres:
    DELETE 50

    ...and pick things up again here where I will try the COPY FROM:

    amd_power_dissipation=# COPY amd_bang_per_watt FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv' WITH CSV QUOTE AS E'\042';
    ERROR: value too long for type character varying(10)
    CONTEXT: COPY amd_bang_per_watt, line 1, column model_family: "Model Family"
    amd_power_dissipation=#

    and taking care of the header we execute

    amd_power_dissipation=# COPY amd_bang_per_watt FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv' WITH CSV HEADER QUOTE AS E'\042';
    ERROR: null value in column "speed_power_ratio" violates not-null constraint
    CONTEXT: COPY amd_bang_per_watt, line 4: ""X2 ","6??? x¹ ",2.3,,,"3.4 HT3 ",,,45,"x¹ 2009 ","
    amd_power_dissipation=# select count(*) from amd_bang_per_watt;
    count
    -------
    0
    (1 row)

    which is still not completing as we would wish so attempt 3:

    amd_power_dissipation=# COPY amd_bang_per_watt FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv' WITH CSV HEADER QUOTE AS E'\042' FORCE NOT NULL speed_power_ratio;
    ERROR: invalid input syntax for type numeric: ""
    CONTEXT: COPY amd_bang_per_watt, line 4, column speed_power_ratio: ""

    ...attempt 4...

    amd_power_dissipation=# ;COPY amd_bang_per_watt (1,2,3,4,5,6,7,8,9,10) FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv' WITH CSV HEADER QUOTE AS E'\042';
    ERROR: syntax error at or near "1"
    LINE 1: COPY amd_bang_per_watt (1,2,3,4,5,6,7,8,9,10) FROM '/tmp/amd...

    ...attempt 5...

    amd_power_dissipation=# COPY amd_bang_per_watt (model_family,model,clock_speed,l2cache,l3cache,ht_bus_ghz,voltage,socket,tdp_watts,process_comments) FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv' WITH CSV HEADER QUOTE AS E'\042';
    ERROR: extra data after last expected column
    CONTEXT: COPY amd_bang_per_watt, line 2: ""X2 II ","550 Black Edition ",3.1,"2x512k ","6MB ",2,"1.15-1.425 ","AM3 ",80,"45nm Callisto Q3-2009 ..."

    ...attempt 6...

    amd_power_dissipation=# COPY amd_bang_per_watt (model_family,model,clock_speed,l2cache,l3cache,ht_bus_ghz,voltage,socket,tdp_watts,process_comments,process_comments) FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv' WITH CSV HEADER QUOTE AS E'\042';
    ERROR: column "process_comments" specified more than once
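The errors in the attempts above all come down to the shape of the file: how many fields each row has, and which of them are empty. A quick check outside the database can show that before another COPY is tried. This is a rough sketch only. The column order is taken from the COPY column list above with speed_power_ratio assumed to be the final column, the NOT NULL set holds just the two columns that postgres complained about, and the function name is invented for the example.

```python
import csv

# Column order assumed from the COPY column list, plus speed_power_ratio last.
COLUMNS = [
    "model_family", "model", "clock_speed", "l2cache", "l3cache",
    "ht_bus_ghz", "voltage", "socket", "tdp_watts", "process_comments",
    "speed_power_ratio",
]
# Only the columns named in the not-null errors; the real table may have more.
NOT_NULL = {"clock_speed", "speed_power_ratio"}


def find_problem_rows(lines, columns=COLUMNS, not_null=NOT_NULL):
    """Return (line number, message) pairs for rows COPY would likely reject."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row, as COPY ... CSV HEADER does
    problems = []
    for row in reader:
        lineno = reader.line_num
        if len(row) != len(columns):
            problems.append(
                (lineno, f"expected {len(columns)} fields, found {len(row)}")
            )
            continue
        for name, value in zip(columns, row):
            if name in not_null and value.strip() == "":
                problems.append(
                    (lineno, f"empty value for NOT NULL column {name}")
                )
    return problems
```

Passing an open file (for example open('/tmp/amdAM3clockSpeedsAndWattage__200907.csv', newline='')) as lines gives a list of rows to look at, with line numbers counted the same way as in the COPY error context.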

    ...and still no success so back to the drawing board (or rather sed in fact):

    sed 's/\,$/\,0/' <> amdAM3clockSpeedsAndWattage__200907.csv-truncated
    : #root@156ns1(tmp) ;sed 's/\,$/\,0/' <> amdAM3clockSpeedsAndWattage__200907.csv-edited
    : #root@156ns1(tmp) ;sed 's/\,\,\,/\,0\,\,/' <> amdAM3clockSpeedsAndWattage__200907.csv-edited2

    Here is some dialogue to explain how I came to have three sed commands of which the first is now redundant:

    amd_power_dissipation=# COPY amd_bang_per_watt FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv-truncated' WITH CSV HEADER QUOTE AS E'\042';
    ERROR: missing data for column "speed_power_ratio"
    CONTEXT: COPY amd_bang_per_watt, line 4: ""X2 ","6??? x¹ ",2.3,,,"3.4 HT3 ",,,45,"x¹ 2009 ""
    amd_power_dissipation=# COPY amd_bang_per_watt FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv-edited' WITH CSV HEADER QUOTE AS E'\042';
    ERROR: null value in column "clock_speed" violates not-null constraint
    CONTEXT: COPY amd_bang_per_watt, line 10: ""X2 ","6??? x¹ ",,,"2MB ","DDR2 ",,,,"x¹ 65nm Kuma Q2/2008 ",0"
    amd_power_dissipation=# COPY amd_bang_per_watt FROM '/tmp/amdAM3clockSpeedsAndWattage__200907.csv-edited2' WITH CSV HEADER QUOTE AS E'\042';
    COPY 50

    and now a few selects to check things look okay:

    amd_power_dissipation=# select count(*) from amd_bang_per_watt;
    count
    -------
    50
    (1 row)

    amd_power_dissipation=# select count(*),substr(process_comments,1,4) as nm from amd_bang_per_watt group by nm;
    count | nm
    -------+------
    2 | x¹
    3 | x¹ Q
    1 | x¹ 4
    6 | x¹ 2
    19 | 65nm
    15 | 45nm
    4 | x¹ 6
    (7 rows)

    All looks well here.

    The CSV files (original, and edited using sed) are listed below:
    I recommend saving the above files to your filesystem and viewing them from there, however if you do instead open them directly in your browser and you see x¹ rather than x¹ then try using Konqueror instead.

    Konqueror is a clickable install on most Linux desktop distributions and is available to Windows users via KDE on Windows project.


    Compatible loading approach:

    Here is an extract of some ruby code which I looked up, so as I have an idea of what mysql types map to what postgres types:

    def convert_type(type)
      case type
      when "tinyint(1)"
        "boolean"
      when /tinyint/
        "tinyint"
      when /int/
        "integer"
      when /varchar/
        "varchar"
      when /decimal/
        "decimal"
      else
        type
      end
    end
    You might find a more accessible read in the form of the Mark Slade migration guide for mysql to postgres. That guide includes some discussion about datatype mappings.

    What I wanted to check was that decimal in mysql would mean decimal in postgres, and, yes it seems that it does.

    One recommendation for using decimal fields in any database is to always specify both parameters rather than relying on database defaults.
    That way if you do end up exporting/importing as part of some migration, then you will not be tripped up by differing defaults for precision.
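
    To make the point about defaults concrete, here is a small ruby sketch (my own illustration, not part of the extract above). It assumes the mysql default of decimal(10,0) for a bare decimal, whereas postgres treats a bare decimal as unconstrained. The helper name explicit_decimal and the two constants are made up for this example.

```ruby
# Assumed mysql defaults for a bare DECIMAL column: precision 10, scale 0.
MYSQL_DEFAULT_PRECISION = 10
MYSQL_DEFAULT_SCALE = 0

# Rewrite a decimal column type so that both parameters are spelled out.
# Any type that is not a decimal is returned unchanged.
def explicit_decimal(type)
  match = type.strip.match(/\Adecimal(?:\((\d+)(?:\s*,\s*(\d+))?\))?\z/i)
  return type unless match

  precision = match[1] || MYSQL_DEFAULT_PRECISION
  scale = match[2] || MYSQL_DEFAULT_SCALE
  "decimal(#{precision},#{scale})"
end

puts explicit_decimal("decimal")       # decimal(10,0)
puts explicit_decimal("decimal(5)")    # decimal(5,0)
puts explicit_decimal("decimal(5,2)")  # decimal(5,2)
puts explicit_decimal("varchar(20)")   # varchar(20)
```

    Running the column types through something like this before exporting means the create table statement says exactly what precision and scale you intended, whichever database ends up reading it.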

    Now, in part 3 of the 'sample data and mysql' postings, I used the --compatible=postgresql flag to produce these files:
    Having followed the mysqldump documentation link through to the section on server modes, it seems that a reasonable expectation is as follows:
    • Mysql is going to do what it can to help you when you ask for --compatible=somerdbms
    • Mysql will try and avoid giving you sql output it knows for sure will cause somerdbms a problem.
    • Mysql is not promising 100% compatibility, but just goes some way to making your task less onerous.
    In particular you should not expect --compatible=postgresql to spit out sql which postgres can feed on right away - there is still some work to do.

    To give a quick illustration I took the .mysqldump4postgresSansOpt file and removed the row inserts and tried to get psql to execute it:

    cat /tmp/amd_bang_per_watt.mysqldump4postgresCreateOnly | psql amd_power_dissipation postgres
    Password for user postgres:
    ERROR: syntax error at or near "@"
    LINE 1: SET @saved_cs_client = @@character_set_client;
    ^
    ERROR: unrecognized configuration parameter "character_set_client"
    ERROR: syntax error at or near "COMMENT"
    LINE 12: ...power_ratio" decimal(5,2) NOT NULL default '0.00' COMMENT 'b...
    ^
    ERROR: syntax error at or near "@"
    LINE 1: SET character_set_client = @saved_cs_client;

    As you can see postgres is not happy.

    So I clean it up a bit and retry:

    : #root@156ns1(~) ;cat /tmp/amd_bang_per_watt.mysqldump4postgresCreateOnlyCleanedUp | psql amd_power_dissipation postgres
    Password for user postgres:
    CREATE TABLE

    ...better. What I had to do was remove the column comment for speed_power_ratio and remove the SET statements that mysql had placed before and after the create table block.
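
    I did that cleanup by hand, but it could be scripted. Below is a rough ruby sketch, assuming the only problems are the ones psql complained about above (the SET lines and the column COMMENT clause). The method name, the regular expressions and the cut-down sample dump are all my own illustration; in particular the comment text 'example comment' is made up, and a real dump may need further handling.

```ruby
# Matches a mysql column comment clause such as: COMMENT 'some text'
COMMENT_CLAUSE = /\s+COMMENT\s+'(?:[^'\\]|\\.|'')*'/i

# Matches the session SET lines that mysqldump wraps around CREATE TABLE.
MYSQL_SET_LINE = /\A\s*SET\s+(@|character_set_client)/i

# Drop the SET lines and strip COMMENT clauses, leaving the rest untouched.
def clean_create_sql(sql)
  kept = sql.lines.reject { |line| line =~ MYSQL_SET_LINE }
  kept.map { |line| line.sub(COMMENT_CLAUSE, "") }.join
end

# Cut-down sample for illustration only (not the real dump file).
dump = <<~SQL
  SET @saved_cs_client = @@character_set_client;
  SET character_set_client = utf8;
  CREATE TABLE "amd_bang_per_watt" (
    "speed_power_ratio" decimal(5,2) NOT NULL default '0.00' COMMENT 'example comment'
  );
  SET character_set_client = @saved_cs_client;
SQL

puts clean_create_sql(dump)
```

    The output of a script like this could then be piped into psql in the same way as the hand-edited file.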

    Here are the files as I worked on them:
    So we have the table. Let's go on and run the inserts as follows (abbreviated):

    : #root@156ns1(~) ;cat /tmp/amd_bang_per_watt.mysqldump4postgresInsertsOnly | psql amd_power_dissipation postgres
    Password for user postgres:
    INSERT 0 1
    ...
    INSERT 0 1

    The 'INSERT 0 1' feedback above is what we should expect given our setup here.
    Further reading of the postgres documentation for INSERT and scanning down for oid should make things clear.
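
    In short, the feedback has the form 'INSERT oid count', where count is the number of rows inserted and oid is 0 unless the table has oids. If you capture the psql output, a few lines of ruby can total the counts so you can compare against the row count you expect. This is only a sketch; the method name inserted_rows and the sample string are my own.

```ruby
# Sum the row counts from psql feedback lines of the form "INSERT oid count".
# Lines that are not INSERT feedback contribute nothing.
def inserted_rows(psql_output)
  psql_output.lines.sum do |line|
    tag = line.strip.match(/\AINSERT (\d+) (\d+)\z/)
    tag ? tag[2].to_i : 0
  end
end

sample = "INSERT 0 1\nINSERT 0 1\nINSERT 0 1\n"
puts inserted_rows(sample)  # 3
```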

    A few quick selects to see if things look okay:

    postgres@ns1:~$ psql
    Password:
    Welcome to psql 8.3.7, the PostgreSQL interactive terminal.

    Type: \copyright for distribution terms
    \h for help with SQL commands
    \? for help with psql commands
    \g or terminate with semicolon to execute query
    \q to quit

    postgres=# \l
    List of databases
    Name | Owner | Encoding
    -----------------------+----------+----------
    amd_power_dissipation | postgres | UTF8
    postgres | postgres | UTF8
    template0 | postgres | UTF8
    template1 | postgres | UTF8
    (4 rows)

    postgres=# \dt
    No relations found.
    postgres=# \c amd_power_dissipation
    You are now connected to database "amd_power_dissipation".
    amd_power_dissipation=# select count(*) from amd_bang_per_watt;
    count
    -------
    50
    (1 row)

    amd_power_dissipation=# select count(*),substr(process_comments,1,4) as nm from amd_bang_per_watt group by nm;
    count | nm
    -------+------
    4 | x¹ 6
    19 | 65nm
    2 | x¹
    3 | x¹ Q
    1 | x¹ 4
    6 | x¹ 2
    15 | 45nm
    (7 rows)

    amd_power_dissipation=# select model,clock_speed,l3cache,tdp_watts,speed_power_ratio from amd_bang_per_watt where speed_power_ratio > 27;
    model | clock_speed | l3cache | tdp_watts | speed_power_ratio
    --------------------+-------------+---------+-----------+-------------------
    550 Black Edition | 3.1 | 6MB | 80 | 38.75
    545 | 3 | 6MB | 80 | 37.50
    720 Black Edition | 2.8 | 6MB | 95 | 29.47
    710 | 2.6 | 6MB | 95 | 27.37
    705e | 2.5 | 6MB | 65 | 38.46
    810 | 2.6 | 4MB | 95 | 27.37
    900e | 2.4 | 6MB | 65 | 36.92
    905e | 2.5 | 6MB | 65 | 38.46
    910 | 2.6 | 6MB | 95 | 27.37
    945 | 3 | 6MB | 95 | 31.58
    9100e | 1.8 | 2MB | 65 | 27.69
    9150e | 1.8 | 2MB | 65 | 27.69
    (12 rows)

    Sidenote: Having tried the substr() function, it seems that the sql statement shown below works unaltered in both mysql and postgres:

    select count(*),substr(process_comments,1,4) as nm from amd_bang_per_watt group by nm;


    Conclusion:

    The --compatible option of mysql goes some way to getting your data into postgres and this article hopefully gives a flavour of what manual steps you might have to take.

    Having worked through both options for getting the data in (csv and --compatible), I have to say that I am much in favour of using --compatible=postgresql and going that way.