Saturday, December 26, 2009

Ubuntu on Dell Laptops - Auto shutdown at 5% battery

If your Dell Laptop came preinstalled with Ubuntu it probably had nice
shutdown setting from Day 1.

If any of the following apply then you may not have the auto shutdown setup ideally:
  • You installed Ubuntu on your Dell Manually
  • You have upgraded your version of Ubuntu
  • The battery you are running with is not the original
(Explanations for the above are given at the end of this posting)

By default, the settings on my preinstalled Dell, were such that at 2 minutes battery life remaining, the thing should shut down.

I now show some settings which I would recommend and then comment at the end regarding the defaults.

Firstly change
use_time_for_policy


so it is not checked...

(the images can be clicked to make them zoom in bigger if on your screen they blur)


...which has the effect of making your system look at percentages instead.

Now change the percentages to the following numbers:
And now your system is set to Auto shutdown at 5% battery.

Some (optional) extra reading now follows, regarding the original settings.

Ubuntu/Gnome has three thresholds which are 'action', 'critical', 'low' and in normal running your system would hit the 'low' threshold, then if still draining you would hit the 'critical' threshold.

Finally when the battery is almost out of juice entirely, the 'action' threshold (the point at which your system will shutdown) is encountered.

Here are the unchanged settings that my system had originally:

...and frankly they are less than ideal for several reasons which I now elaborate upon.

(1) Firstly by using time_action, time_critical, time_low the system is relying on estimations which are at best rough guesswork. By changing use_time_for_policy to be unchecked we are now avoiding this time based stuff.

Time based thresholds work well if you are running an unchanged system and your batteries are well know to the system*

(*Batteries in regular use which have had a couple of full charge and discharge cycles are just fine)

If however your batteries are replacements and/or the system has not got enough historic information, then any time remaining estimates are very much guesswork and using time thresholds may be unreliable.

If you have upgraded your system (and/or installed new versions of power manager) then the battery history information may have been emptied out. Without history the power manager is pretty helpless at the time based stuff.

(2) Having auto shutdown at 2% is in my opinion skirting a bit close to the edge. Alright you might have a battery that really does report it's percentage remaining very well. Even so for a 3 hour battery (pretty typical) 2% remaining is 3.6 minutes which is not a lot really.

As your battery ages it's percentage reporting might become less reliable also. Having a setting of 2 minutes (time_action = 120) is not a setting that I would like either.

Having 5% for the percentage_action and 7% for the percentage_critical gives me 3 to 4 minutes to save my work nicely before the system does an auto shutdown.

I want to be warned at 15% (around half hour to go) so that I can get to somewhere with a socket and plug in.

These settings are just my preferences, and you will have your own ideas of good percentages to use that suit your own working style.

Sunday, December 13, 2009

Virgin Broadband and Digital Photography - No good

This article is intended to make clear some limitations of the Virgin Broadband offering with regard to Digital Photography.

I recently graduated in Mathematics and the family took some good pictures, and a few short 1 or 2 minute movies.

Nothing too fancy, just handheld digital cameras, rather than camcorders or things that produce huge files.

What I discovered today is that Virgin Broadband have a throttling policy that kicks in once you upload 200 megabytes.

Now just to put the numbers in modern terms - A new £150 digital camera is probably going to give you a 10 megapixel shot, and result in a photo that takes somewhere between 5 and 10 megabytes of space on the computer.

Being conservative, and assuming they are all 5 megabytes that means that 200 megabytes of photos (40 snaps in all) is going to trigger Virgin's throttler.

Virgin Throttler - about
Here are a few bits of information that I was able to glean from talking to the call centre guy this afternoon:
  • Throttling checks are in place every day 9am->9pm
  • Once applied, any throttle will stay in place for 5 hours.
  • 200 megabytes of upload is enough to trigger throttling
  • Throttling applies in both directions, so fall foul of uploading limit, and your downloading is going to drop to snailpace also.
Summary Comment: Try and upload 40 photographs anytime between 9am and 9pm and your connection will suffer.

Now back to my particular situation....I just wanted to make those couple of minute movies available to the family so thought I would stick them on a website server...



(Click the image above for a full-size version if it appears a bit blurred)

The important information is the KB/s figures which at best show 30KB/s and at worst show - stalled - , or crummy speeds of between 10KB/s and 20KB/s.

In practical terms that 75 meg 1 or 2 minute movie is taking over an hour to upload:



Here are the files in a file manager just to confirm the amount of data I was trying to send:

As you can see the total upload is somewhere around 350 megabytes - the exact figure is not so important.

Even given the information about virgin throttling, the 30KB/s (fastest ahem) which I was getting originally is pretty poor by todays standards.

Here is an extract from the conversation with the broadband help today:

Me: "So the throttling has kicked in because it is a weekend"

Reply: "No because it is peak time - I suggest you try after 9pm"

Me: "And peak time is when?"

Reply: "9am to 9pm each day"

Me "But that is half of the day, and of the remaining half I am asleep for 8 of those hours"

Reply: "That is the traffic shaping policy that was explained when you signed your agreement"

Me: "It never said anywhere that if I upload 200 megabytes you will throttle my connection speed"

...and so on...

The message I think is clear for me anyway, if you are serious about digital photography then take the time to question your proposed broadband provider, so as to make sure that uploading is going to be workable for you.

If you have a 10 megapixel digital camera and you are thinking about switching to Virgin Media Broadband, then you might want to call them up and check what throttling will apply to your broadband package before you sign.

Monday, December 7, 2009

Twitter behaviour - rate limit message back from Flock

Twitter is pretty well behaved most of the time. Just occasionally it gives the 'Too many Tweets' or perhaps some new functionality seems a bit hit or miss (lists today)

I use 3 or 4 different browsers on my desktop, and play mostly Firefox 3.5 but do dabble in other browsers like Flock.

Here is where twitter comes in....

The install of Flock 2.5.5 is a manual install I did a couple of weeks back and I start it from a terminal like so:
/opt/local/flock255/flock-browser
...and here is todays mystery output:


...which will be too small to read until you click on the image above to enlarge it.

The key message is in the final line and says:
Rate limit exceeded. Clients may not make more than 150 requests per hour.
Now I didn't know that Twitter limited access requests this way, but I suppose the 'denial of service' attacks they suffered in mid 2009 did force their hand a bit.

Anyway if you are using Flock, then you might need to be aware that the cunning sidebar thing Flock does to show you 'following' and 'retweets' might sometimes throw you over Twitters API limit.

This is just my interpretation, and if you know different, then please comment and I'll update this post as appropriate.

Note: In Flock's output (shown in the clickable image in this post) I substituted 12345678 for my Twitter account number for obfuscation.

Thursday, November 5, 2009

Managing your Ubuntu - every release or LTS?

If you are new to Ubuntu then you may not yet have thought about how
to go about upgrading to the next version.



If you have installed Ubuntu yourself (rather than it coming preinstalled) then you need to be aware of this screen:


This screen is particular to Linux/Ubuntu (other non-Linux systems are unlikely to offer 'free upgrades' with no trial period or catches)

As an Ubuntu user you should only press the 'Upgrade' button at the top if that is really what you want to do.
If not then 'Install Updates' and keep running the Ubuntu version you are on.

If you are choosing to stay where you are Ubuntuwise for now, then you are able to stop that 'Upgrade' button from appearing on the screen as follows:

Press the 'Settings...' button and alter the section 'Release Upgrade' in there
...or...
(for command line folks)
cat /etc/update-manager/release-upgrades


...read from the following...
# default prompting behavior, valid options:
#  never  - never prompt for a new distribution version
#  normal - prompt if a new version of the distribution is available
#  lts    - prompt only if a LTS version of the distribution is available

...set
Prompt=never
...or whatever your preference is using an editor.

I now begin a discussion about some future choices regarding upgrades, by using a couple of example systems.
(If you came to this post just to learn how to disable the upgrade prompting then you might want to switch off now)

The two Dell laptops that will be the example systems are as follows:
  1. Inspiron 1525 Silver preinstalled Ubuntu 7.10 Gutsy


    (Now '4th generation'* Ubuntu)
  2. Inspiron mini9 Black
    (Currently '1st generation'* Ubuntu)
*I explain what I personally mean by NNth generation in the next few paragraph

Computers are generally sold with a computer system pre-installed.
There are exceptions and the freedom to ask for 'no install' does have
some value I think.

In the last 10 years if you did buy a pre-installed computer, then you have exchanged money, not only for the physical parts, but also for OS it came preinstalled with.
There is a support element usually, but such support only covers the unit as originally supplied.

Example: Five years ago you bought a laptop with Windows XP. When you bought it you had some reassurance that problems with the operating system would be resolved either with a reinstall disk or by the supplier directly.

Example2: Several years ago you bought a laptop with Mac OS X 10. Same deal as just described.

Having owned either of these laptops for some time, you might have been tempted to try and upgrade (XP to Vista perhaps or maybe OS X to a newer OS X). The choice of when and if to upgrade is something you made based on your own perceived need.

Maybe you wanted a sexy new feature, maybe a pal showed off a new application and you needed a new OS version to get it, perhaps you had something that was bugging you about the old setup.

Similar motivations might occur whatever Operating System you run - in my case Ubuntu.

The Inspiron 1525 is '4th generation' as the version of Ubuntu which I have run on it are: (i) 7.10 Gutsy, (ii) 8.04 Hardy, (iii) 8.10 Intrepid, and (iv) 9.04 Jaunty [ today ].
(This machine is my testing rig really and I choose to keep it constantly upgraded to the latest release)

The Inspiron mini9 Black (mini9) is '1st generation' (never been upgraded) as it is a general machine and its focus is stability and consistent operation (how most people run their regular machine I think)

When you first come to use Ubuntu you may well be looking mainly for 'stability and consistent operation', and perhaps may never want to upgrade.
Perhaps stay on the version of Ubuntu you are on for at least a year and then reassess things then.

For system suppliers there is a mechanism in place regarding infrequent upgrades termed LTS. If you are a Laptop system supplier then you probably already know about Ubuntu 'Long Term Support', and will be making plans for the next LTS release in April 2010 which includes 3 years support.

(The mini9 will probably probably stay on its current Ubuntu for some time yet and I would not be pressing the 'Upgrade' button). Note: This does not make my mini9 system insecure, as it still benefits from regular security updates.

I hope the later half of this post has introduced you to the options available to you regarding upgrades. It really is a matter of personal choice when and if to upgrade - after all it is your system!

Note: The laptop images in this post are intended to represent physical machines that I own, but are Dell's own promotional images (copyright Dell)

If you read other articles of mine, then you might know that currently I run Debian, Ubuntu, and CentOS for a variety of project tasks. I intend to write articles about all of these systems as subjects arise.

Final points for extra clarity:
Setting Prompt=never in the file /etc/update-manager/release-upgrades will mean that you still get informed about updates (security, whatever) but that you are never prompted to upgrade to a newer version of your OS.

Updating: Your OS gets security updates for 18 months to 3 years*
Upgrading: Think Windows XP -> Windows Vista if it helps you with the concept

*18 months to 3 years is clarified further in the first few paragraphs of this "what is ubuntu" page

Tuesday, October 20, 2009

One shed or several sheds - DVDram

The title needs a bit of explaining and I do that next.
( If you want to skip straight to the factual bits then skip down a page now. )


When it comes to how you organise your storage, people like to do things in different ways. I use a physical scenario to aid this introduction to DVDram.

(DVDram is how I refer to things in this article, as visually it seems to stand out and it is unlikely to be confused with the more widespread dvd standards you already own perhaps)

You live with a partner and have a large back garden area, and your partner says hey you can have half of that space out back and make some storage for yourself - they have some great wooden sheds down at the DIY place how about it?

I'm just going to give some dimensions to add a bit of realism here - so the area you can have is 10 metres by 20 meters with free access on both of the longer sides.

Question: Do you go for one single shed or several sheds?

You price things up and it turns out that you can cover that whole area with one single shed for £900 or alternatively 3 individual sheds for 3 * £300 = £900.

So pricing there is no difference, it really is a case of personal preference.

Here is where your experience comes in...
  1. Are you the type to break up a storage closet into shelves?
  2. Do you prefer to spend less time organising up front and are happy to trawl through your boxes later on?
Where DVDram wins, for my mind, is if you answered YES for (1) and NO for (2) and I make some bullet points here:
  • Appendable without having to erase what is already on the disk.
    [ CDRW multi-session achieved this but some find it a bit of a pain ]
  • An alternative to having a box full of 4gig usb sticks
    [ A whole bunch of 4gig usb sticks might have quicker writes but have you tried writing readable labels on something so small? ]
  • An alternative to a single huge hard drive for backups (if you prefer the hard drive way then you might have answered NO for (1) above)
Drawbacks of DVDram:
  • Best suited to write a lot, append a bit sort of data*
  • Great for offline storage for things that are genuine snapshots but only consulted in dire emergencies.
*I give an example next to illustrate what I mean by write a lot, append a bit.

I have a 2 Gigabyte Audio Player...


...and as with all flash solid state memory, it will stop working at some point.

"Backup your work, backup your life!"

...are what we hear and so several times a year I have been taking a snapshot of this drive in tar form or as squashfs.

And so each time I visit Amazon for some mp3 songs, if I then took another snapshot this would be overkill. But the new songs should be appended somehow.

Howto:
  1. --append option of tar will do the trick
  2. Create a new squashfs (seems a bit wasteful to do this but okay for irregular full snapshots approach to backing up).
  3. DVDram
  4. USB key (if you insist)
  5. I have a super huge X Terrabyte hard drive so who cares.
    (you may not have answered YES for (1) and NO for (2) in the wee test)
So, going the DVDram way, is there anything you should know? Well firstly it is going to take 20 to 30 minutes to write 2 Gigabytes when you first use a blank DVDram.

The 20 minutes it takes is why I was particular in saying I use it for 'write a lot, append a bit' earlier.

So having backed up my Audio Player once, I just keep appending the new stuff (using drag and drop if you like).

Adding new songs to my backup is really quick and no hassle (Unlike regular DVD rewriteables I have no need to erase each time)

Takes about a minute to write 50 megabytes of new songs in my experience.

Here are some thing that I was most surprised about regarding DVDram:
  • Not only is a DVDram drive affordable, but lots of you already have one without realising! A modern SATA dvd writer will often also read and write DVDram.
  • The discs are the same prices as rewriteables so you will likely get change from 2 whatevers (insert your currency here)
If your still wondering about the cost of a DVDram drive after reading the above, then just to clarify - I just purchased a Samsung SATA DVD writer for £25 in UK and it does the job just fine.

Q&A:
  • Does the disk spin constantly? NO. When you are using the data on it it will spin (as would a hard drive).
  • But surely a DVDram drive is expensive? About the price of a glass of beer.
  • The spinup time for DVDram must be an age? Err no, 4 seconds here.
  • You cannot drag and drop onto it? Yes you can. I use Thunar - it's good to go.

Footnotes and further reading:

If you are interested in Amazon Drmless mp3 and are running Debian 5 or newer then there are some instructions here on how to proceed. In short you would need to get amazonmp3.deb from Amazon and ensure you have some libboost stuff also.

DVDram uses the UDF filesystem so has no issues with huge files (over 2 Gigabytes) whereas regular DVD media would likely autoswitch from iso9660 to UDF in that instance.

The optical drive in your laptop may well read DVDram okay. Mine is a Dell Inspiron 1525 bought in 2008 and it works great. Dell made no mention of it in the specification I think. The laptop is an N model and came with Ubuntu preinstalled.
[ Drive was identified by K3b as TSSTcorp DVD+-RW TS-L632H (details here) ]

The optical drive in your laptop may well not be any good at writing DVDram. Some manufacturers may not have taken the trouble to enable the DVDram feature, even though the drive is perfectly capable.

Comments welcome. If you are already a firm fan of your current backup system then stick with it. After all if it works then that's great.

If you have TSSTcorp TS-L632H but cannot get it to read DVDram then perhaps you might try either or both of the following:
  • Use 3x speed Traxdata DVDram 4.7gb single sided (120 minute) disc
    (Then you would be using the same disc I tested with)
  • If you are not already using Ubuntu or Debian then try a live disc with "Make no changes to my computer" startup option, and see if you get a better result whilst running live disc.

Saturday, September 19, 2009

renaming user accounts - why not?

This post is intended as a short investigation into the effects and pitfalls of manually renaming a user account from the command line.
Saying "don't do it" is all well and good but I hope to explain some of the trickiness that is involved, if you insist on doing it, and also say a bit about what will be broken.

Okay so here is the scenario using fictional characters from tv/movies:












When Harry met Sally they got all romantic and then after a few years they talk about new laptops.

Harry gives his Ubuntu Linux laptop to Sally and says it's all setup for the internet and everything and my password is 12344321. Just rename the harry account to sally and your good to go.

Sally is comfortable with the command line and wants to try out Harry's suggestion as an exercise. (from now on sally is in pink as a visual aid)
I assume from here on in that you know the VI editor or Emacs editor (my preferred) but will just refer to VI in this post.

*Warning: This post is of the 'do not do it' kind and should you follow any and all of it's numbered steps you may render your user account unusable. If you are in a hurry then skip the long lesson and just read about usermod at the end of the post*

(1) Use vipw to change the user account name from harry to sally.
You will also have to then do the same using vipw -s command.

(2) Use vigr to change all occurrences of harry to sally.
You will also have to then do the same using vigr -s command.

If you were to try and log in between steps 1 and 2 above, then you would be able to login but the privileges you would have as a user would not be as you expect them to be.

Now you have dealt with the login side of things, you login as yourself (sally) and see that issuing pwd to show current working directory displays /home/harry (which does not look right at all!)

Worse still a quick peek in /home/harry/.sane/xsane/xsane.rc gives a horror story illustrating why some configuration files are going to break for sure if you rush into step (3):

sally@laptop:~$ fgrep -B1 -A1 harry /home/harry/.sane/xsane/xsane.rc
"working-directory"
"/home/harry"
"filename"
"/home/harry/tmp/scan0006.tiff"
"filetype"
--
"fax-project"
"/home/harry/faxproject"
"fax-command"
--
"e-mail-project"
"/home/harry/emailproject"
"e-mail-filetype"
--
"multipage-project"
"/home/harry/multipageproject"
"multipage-filetype"
sally@laptop:~$


(3) mv /home/harry /home/sally
Do not do the above please :- here is where I decide to abandon my initial approach and consider things a bit more carefully.

Essentially if this were a server then you might have a slightly better chance of attacking this problem using the blunt approach detailed so far. However this is a laptop/desktop machine and so unless you really are a Gnome/KDE/WindowMaker/GNUstep expert then you best just stay well clear of trying steps (1), (2), (3).

In fact I did just follow steps (1) and (2) on my machine and so I am going to reverse what I did, to put things back the way they were, before proceeding further.

Now here I review my real aim and state how I will achieve it:

Aim: To keep the system uid number that harry owned and all the group privileges but have everything appear marked sally. All data files (not configuration files) from harry to be made available and owned by sally.

Solution Approach (long way round):
  1. Create a fresh sally account before you start hacking around yourself.
  2. Give the new account a password of your choice and log in to that account in Gnome, KDE, WindowMaker/GNUstep (whatever desktops you use)
  3. Copy the data files from /home/harry to /home/sally
  4. Issue the cd and find commands shown below***
  5. Using vipw and vipw -s commands (i) make harry have home of /home/sally (ii) swap the system uids of harry and sally.
  6. Using vigr and vigr -s commands change all occurrences of harry to sally ensuring you deal with any duplicate situation. (Changing audio:x:29:pulse,harry,sally to be audio:x:29:pulse,sally,sally would be a duplication situation and just remove the second occurence of sally so it now reads audio:x:29:pulse,sally [ Note: Do not make any alterations to the first field (to the left of the first colon) ]
  7. Now use vigr and vigr -s to deal with just the first field so interchanging harry for sally (only if they appear in the first field on your system setup)
  8. Login to sally using password 12344321 and check everything seems okay.
cd /home/
find sally/ -type d -user sally -exec chown harry {} \;
find sally/ -type f -user sally -exec chown harry {} \;
(You might also want a further two find commands that make use of the -group switch but I leave that as an exercise for the reader)

Assuming everything went okay you are free to remove the harry account altogether when you see fit.

An alternative approach to all of the above, that should work, and is perhaps a lot easier is to use involves the usermod command. You would however have to peek in /etc/passwd to get harry's system uid and manually remove the harry account before issuing usermod so as to avoid non-uniqueness issue:-
The uid given with the -u option is already in use. (from unix command)
(usermod on Linux has -o option if you want to do the usermod above a different way)

I hope this post illustrated a few reasons to not just dive in and start reassigning things between user accounts without careful consideration.

If you already did and you have websearched here because you are stuck, then thoroughly understanding what goes on in steps 1 to 7 above might help you figure out a way of backing out from what you have done :)

Where did the fictional characters pictured/linked above come from:

Tuesday, August 4, 2009

postgres - xml out (part 3 of 3) table_to_xml

Earlier in the 'postgres xml out' articles we used query_to_xml() to get data out of postgres in xml format.

Postgres also has the function table_to_xml() which is perfect for exporting the whole table in xml format.

Here is the function signature from the postgres documentation page:
table_to_xml(tbl regclass, nulls boolean, tableforest boolean, targetns text)

The first parameter will be 'amd_bang_per_watt' for all the examples in this article.
The fourth parameter I will not be making use of and so '' will be supplied.

The parameters that I will discuss (and vary for illustration) are the boolean second and third parameters, nulls and tableforest respectively.

Four examples of this output are listed below marked with the values I used for the booleans null and tableforest:
  1. amd_bang_per_watt-table_to_xmlFFheaderless.xml (tbl,False,False,'')
  2. amd_bang_per_watt-table_to_xmlFTheaderless.xml (tbl,False,True,'')
  3. amd_bang_per_watt-table_to_xmlTFheaderless.xml (tbl,True,False,'') ***
  4. amd_bang_per_watt-table_to_xmlTTheaderless.xml (tbl,True,True,'')
Option 3 has been marked with *** to indicate this is my preferred output form and I will be using the xml this form produces in the next paragraphs and in some later articles.

Using True for the parameter
nulls boolean
...gets you...
<voltage xsi:nil="true"/>
rather than omission (the voltage node being omitted altogether).

Using False for the parameter
tableforest boolean
means that each row will be wrapped in a
<row> and </row>
pairing which is nice and convenient.

Executing suitable queries in psql and chopping off the head and tail will produce files similar to the four files listed above. An example of *** (Option 3) is shown here:

psql amd_power_dissipation -c "select table_to_xml('amd_bang_per_watt',true,false,'');" | \
tail -n +3 | head -n -3 > amd_bang_per_watt-table_to_xmlTF.xml;


...and the commands in full are in this text file.

Extracting the 1st row, 47th row, and 50th (final) row of sample data from the file amd_bang_per_watt-table_to_xmlTFheaderless.xml, I create row1headerless.xml containing:
<row>
<model_family>X2 II </model_family>
<model>550 Black Edition </model>
<clock_speed>3.1</clock_speed>
<l2cache>2x512k </l2cache>
<l3cache>6MB </l3cache>
<ht_bus_ghz>2</ht_bus_ghz>
<voltage>1.15-1.425 </voltage>
<socket>AM3 </socket>
<tdp_watts>80</tdp_watts>
<process_comments>45nm Callisto Q3-2009 </process_comments>
<speed_power_ratio>38.75</speed_power_ratio>
</row>
and row47headerless.xml containing:
<row>
<model_family>X4 II </model_family>
<model>965 Black Edition </model>
<clock_speed>3.4</clock_speed>
<l2cache>4x512K </l2cache>
<l3cache>6MB </l3cache>
<ht_bus_ghz>2</ht_bus_ghz>
<voltage>1.15-1.425 </voltage>
<socket>AM3 </socket>
<tdp_watts>125</tdp_watts>
<process_comments>x¹ 45nm Deneb Q3-2009 </process_comments>
<speed_power_ratio>0.00</speed_power_ratio>
</row>
and row50headerless.xml containing:
<row>
<model_family>FX </model_family>
<model>FX-82 x¹ </model>
<clock_speed>2.6</clock_speed>
<l2cache>4x512KB </l2cache>
<l3cache>2MB </l3cache>
<ht_bus_ghz>0</ht_bus_ghz>
<voltage xsi:nil="true"/>
<socket>AM2+ </socket>
<tdp_watts>125</tdp_watts>
<process_comments>x¹</process_comments>
<speed_power_ratio>0.00</speed_power_ratio>
</row>

You might recall from previous postings regarding this sample data, that the data contains a superscript numeral character 'superscript one' (¹) having utf8 code 00B9 (sometimes appearing in code analysis output as \xc2\xb9).

In particular just glancing at the rows we just output I can see that row47 and row50 both contain superscript one in some of their fields.

The next post will attempt to deal with getting xml data into postgres from a file.

Monday, August 3, 2009

norge press in your native lang - rss in reader

The inspiration for this posting came from a short messsage (tweet) in which I pointed out a useful facility of google reader and this led to an enquiry.

Rather than answer the enquiry via a series of short messages, I thought a blog posting including some screenshots might be a more useful way.

I needed a practical example and that is where the 'Norge' in the title comes in. I take an interest in Norwegian news and thought this would be a good illustration of rss and the reader product from google.

From now on I will just refer to the software from reader.google.com as 'Reader' for brevity.

Here is a screenshot of what you see (on the left) when Reader is set up and includes a few feeds:


That screenshot has been cropped and the full width shot shows the actual feed entries in a list.

For the right hand pane, selected 'Expanded' rather than 'List' gives you:

*** At this point if you are already familiar with rss and feeds and the symbols for them and/or use reader software already, then you probably want to skip read the rest of this article. I am deliberately avoiding the point and click shortcuts available for adding rss and aiming this article at being instructive without being the quickest way to get to the goal. ***

The most important button for our use is at the top of the first screenshot and is labelled 'Add a subscription'.

Clicking 'Add a subscription' will bring up a small dialogue box where you should paste in the url of the rss feed.

The highlighted phrases in the previous sentence have links to Wikipedia should you require some explanation of the new terminology I just introduced.

Here are some example urls that are valid feed urls (as of time of writing):
  1. http://dictionary.reference.com/wordoftheday/wotd.rss
  2. http://gnubyexample.blogspot.com/feeds/posts/default
  3. http://googleblog.blogspot.com/atom.xml
  4. http://www.timesonline.co.uk/rssfeed/rssIndexFeed/0,,2,00.xml
Here are some unlikely urls for feeds:
  • http://www.google.com/index.html
  • http://www.w3.org/WAI/
  • http://www.gridstatus.com/
What is different about the first bunch of urls and the second bunch of urls? Well in an ideal world all rss feeds would end with .rss
However that is not the situation, yet there are some clues.
Dealing with the numbered entries (1) ends with .rss which is nice (2) has the word feeds in the path (3) ends with .xml (4) ends with .xml and mentions rss and feeds in the path.

You understand that none of this is an absolute guarantee, however it may be some help to anyone who has not looked at urls in close detail before.

The bunch of 3 urls I marked as unlikely do not mention rss or feed in the path and in particular the first ends index.html.
( The default page for a site is normally default.html or index.html and it would be a bit silly of the website owner to place an rss feed at such a url. )

Now back to the Norwegian news part of this story. I have visited Norway and enjoyed my stay in Oslo and Bergen. Since then I occasionally read up on what is going on in the news.

One of the newspapers with an online presence is Aftenposten and they used to provide a section 'News in English' which I headed for recently. It seems that this editorial is no longer funded so, disappointed, I had a hunt around for an alternative paper and came across The Norway Post which looked to fit the bill.

Now, I do not live in Norway or read the printed copies of Norwegian newspapers. Furthermore I do not really wish to compare the editorial content of these two sites, nor endorse either, as that is not the purpose of this article.

Speaking purely hypothetically, it may be that having digested all the articles at norwaypost.no, I am hungry for more news from that country and want Aftenposten.

Referring back to the title of this article, I now explain a useful feature of Reader which is 'automatic translation'. Google has provided such a thing for web pages for a while. I use it occassionally but it requires some cutting and pasting perhaps. Not so in reader.google.com.

Essentially once the feed url has been entered, there is a nice option 'Translate into my language'.

A feed marked Nyheter (News) is what I am looking for and in firefox 'Tools/Page Info/Feeds' shows me a promising url which I repeat below:
  http://www.aftenposten.no/eksport/rss-1_0/?seksjon=ece_frontpage&utvalg=siste
Pasting this into 'Add a subscription' in Reader brought up a list of news items (not in my language):

...and quickly flicking 'Translate into my language' option of Feed Settings gives the following:

Now I do not claim that google translator is fantastic, in fact in some cases the translation can be way off. However it being free, and a very convenient option of Reader, I will use it without putting too much store by the accuracy of it's results.

Based on the above sample of 3 news items I would have quickly read the first article summary and discarded the other two as suspect translations.

If this posting has got you interested in reader and you want to explore some of it's more advanced facilities then join the group dedicated to reader to swap hints and tips.

Saturday, August 1, 2009

postgres - xml out (part 2 of 3) - query_to_xml

Having shown previously several ways of getting data out of postgres in xml form, I thought there might be some benefit in looking at xmlforest, xmlattributes, and xmlelement.

The compact form for row data produced by 'quick report' of pgAdmin is a good candidate for a manual reconstruction, so i begin there.

First row of the sample data in compact form:
<row id="r1" number="1" c1="X2 II " c2="550 Black Edition " c3="3.1"
c4="2x512k " c5="6MB " c6="2" c7="1.15-1.425 " c8="AM3 " c9="80"
c10="45nm Callisto Q3-2009 " c11="38.75" />
could be generated by xmlelement and xmlattributes as follows:

drop sequence row_nbr;create temporary sequence row_nbr;
select xmlelement (name row, xmlattributes (
'r' || nextval('row_nbr') as id,
currval('row_nbr') as number,
model_family as c1, model as c2,
clock_speed as c3, l2cache as c4,
l3cache as c5, ht_bus_ghz as c6,
voltage as c7, socket as c8,
tdp_watts as c9, process_comments as c10,
speed_power_ratio as c11
)) from amd_bang_per_watt limit 1;


Now it seems silly to write a query where we know all the column names then replace them with c1,c2,c3. However this is just an exercise in reconstruction, so as to illustrate xmlelement and xmlattributes.

Now I will illustrate how to obtain the same row, but this time, using less of the attributes and treating every field as a node to obtain:
<row>
<model_family>X2 II </model_family>
<model>550 Black Edition </model>
<clock_speed>3.1</clock_speed>
<l2cache>2x512k </l2cache>
<l3cache>6MB </l3cache>
<ht_bus_ghz>2</ht_bus_ghz>
<voltage>1.15-1.425 </voltage>
<socket>AM3 </socket>
<tdp_watts>80</tdp_watts>
<process_comments>45nm Callisto Q3-2009 </process_comments>
<speed_power_ratio>38.75</speed_power_ratio>
</row>

...we simply run this query which uses xmlforest:
select xmlelement(name row, xmlforest(
model_family, model, clock_speed, l2cache, l3cache,
ht_bus_ghz, voltage, socket, tdp_watts,
process_comments,
speed_power_ratio
)) from amd_bang_per_watt limit 1;
with a bit of sed postprocessing:
sed 's/></>\n</g' <> amd_bang_per_watt-xmlforested.txt
or if you want to avoid the postprocessing just use the handy postgres query_to_xml() function show here executing in psql:

amd_power_dissipation=# select query_to_xml('select * from amd_bang_per_watt
limit 1',true,true,'');
query_to_xml
---------------------------------------------------------------
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<model_family>X2 II </model_family>
<model>550 Black Edition </model>
<clock_speed>3.1</clock_speed>
<l2cache>2x512k </l2cache>
<l3cache>6MB </l3cache>
<ht_bus_ghz>2</ht_bus_ghz>
<voltage>1.15-1.425 </voltage>
<socket>AM3 </socket>
<tdp_watts>80</tdp_watts>
<process_comments>45nm Callisto Q3-2009 </process_comments>
<speed_power_ratio>38.75</speed_power_ratio>
</row>

Mysql --xml option produces row data in a 'node per field' form discussed previously. However in that output, the field name is an attribute, making it a more tricky task than our examples above.

If you are an XSL or XPath expert then you may well leap straight to more elegant solutions than what I have given here.
I have aimed this posting at postgres users who are not xml experts, however comments showing xsl or xpath ways are very welcome as additional insight.


The 'postgres xml out' postings continue with the next post dealing with more ways to get xml out of postgres.

In this post (sed command) I made use of the bash continuation character pair \newline as described at gnu.org

Thursday, July 30, 2009

postgres - xml in xml out (part 1 of 2)

I will be using familiar sample data and in fact will begin with the assumption that the data is already in postgres.

Hark back to the 'sample data' postings and use COPY amd_bang_per_watt FROM ... if you are coming to this fresh.

I will deal with 'xml out' first as it is a quick win for an already populated table (my situation)

Previous postings have been quite console - this one will be a bit more gui so as to allow us to explore features of the pgAdmin graphical tool.

There are several ways including:

  1. Using psql with the html flag and then munging the html into xml
  2. pgAdmin's table view, clicking 'context menu', and outputting with xml radio button
  3. pgAdmin 'Tools/Query Tool' then 'File/Quick report' (xml radio button selected)
I will ignore the first option (psql) and go for the pgAdmin methods, illustrating with some screenshots.

(Item 3 gave me the best results so feel free to skip ahead if you like)

My pgAdmin screenshots are wide and will not resize well for newspaper style columns. Instead I have put them on picasaweb with links in the text.

2. Tools/Query Tool then File/Quick report:
Selecting 'Reports' from the context menu above gives you six options:
  1. Properties report (an extract is shown below)
  2. DDL report (ddl for table recreation from a script)
  3. Data Dictionary report (an accurate description of what you get)
  4. Statistics Report (Sequential scans count is expectedly high in here)
  5. Dependencies Report (for our table this is empty in an xml style of emptiness)
  6. Dependents Report (whoops - incorrect spellcheck complaint in firefox)
(i) Properties report gives you xml - extract below:
<table>
<columns>
<column id="c1" number="1" name="Property" />
<column id="c2" number="2" name="Value" />
</columns>
<rows>
<row id="r1" number="1" c1="Name" c2="amd_bang_per_watt" />
<row id="r2" number="2" c1="OID" c2="16386" />
<row id="r3" number="3" c1="Owner" c2="postgres" />
<row id="r4" number="4" c1="Tablespace" c2="pg_default" />
<row id="r5" number="5" c1="ACL" c2="{postgres=arwdxt/postgres,someuser=arwd/postgres}" />
<row id="r6" number="6" c1="Primary key" c2="" />
<row id="r7" number="7" c1="Rows (estimated)" c2="50" />
<row id="r8" number="8" c1="Fill factor" c2="" />
<row id="r9" number="9" c1="Rows (counted)" c2="50" />
<row id="r10" number="10" c1="Inherits tables" c2="No" />
<row id="r11" number="11" c1="Inherited tables count" c2="0" />
<row id="r12" number="12" c1="Has OIDs?" c2="No" />
<row id="r13" number="13" c1="System table?" c2="No" />
<row id="r14" number="14" c1="Comment" c2="" />
</rows>
</table>

Click the links in the original six point list for googledocs of all the xml files in full.

What I did not find in any of the six reports was the actual row data from the table - will come to that soon.

3. Tools/Query Tool then File/Quick report:

My pgAdmin screenshots are wide and will not resize well. Instead I have put them on picasaweb with links listed below:

So to get the actual row data my approach was to enter the Query Tool then write a 'select * from ...' type of query and Quick Report the results.

Having selected XML in Quick Report you will obtain the actual row data from the table.

Does this row data look the same as mysql --xml style output?
Not exactly. What happens in pgAdmin xml output is that columns are not treated as individual nodes but instead are attributes of a row node.

Sample first row from the full xml output:
<row id="r1" number="1" c1="X2 II " c2="550 Black Edition " c3="3.1"
c4="2x512k " c5="6MB " c6="2" c7="1.15-1.425 " c8="AM3 " c9="80"
c10="45nm Callisto Q3-2009 " c11="38.75" />

Is this a valid way of representing xml? Certainly.
Will it lead to parsing issues? - No. Any parser worth its salt handles nodes and attributes and exposes them in a useful way to the programmer.

Perhaps you might later use python-libxml2 or similar to construct an insert statement of the form INSERT INTO amd_bang_per_watt VALUES(_,_, from the column entries c1,c2,etc.

This 'more compact' format should save you space and may be doing you a favour if you harbour any thoughts of parsing the rows in jQuery or PHP.

One thing I should point out is that this compact form comes at a price. Should you later insert or remove columns from the table, then the column numbers c1, c2, c3, etc are going to be out of step with your altered table.

The facilities for xml output in your database can be useful, but are not always the quickest route to goal. If you data is going out, then into another postgres database then you would likely want to use the backup option in pgAdmin.

Example of such backup output are shown below:
I found the second backup format a refreshing alternative to some of the issues that can otherwise creep in to comma separated data.

xml in xml out (part 2 of 3) is the next post and will continue this theme.

Thursday, July 23, 2009

ctas for mysql and postgres - create table as select ...

'Create table as select...' is a useful way of defining a new table based on some stuff already in your database.

It might be that you have just started work on an existing system and your fresh eyes can see a table that can be normalised, or perhaps you are told that some of the tables are bloated with obselete columns.

You might use 'create table as select' (CTAS) in the same way you might rework a file on your filesystem by going through copy, edit, move (where the final move replaces the original)
[ So what you did on the way was create a temporary copy and lastly overwriting the original ]

Completing the analogy - in database terms CTAS might create your temporary table and you might look at how to change the schema, so as to ignore the original copy and use your CTAS created table in its place.

Temporary tables for whatever purpose can be created using CTAS or the lesser preferred 'select into'. From postgres documentation: CREATE TABLE AS is the recommended syntax.




The relevant portion of the mysql syntax is
select_statement:[IGNORE | REPLACE] [AS] SELECT ...   (Some legal select statement)
which is not so easy to find amongst the complex syntactical explanation on the main documentation page.

For perl types who have csv data on the filesystem and want to use CTAS look at DBD::CSV module.

And now for some simple examples using mysql and the sample data from the recent postings:

My suggestion is to just do a standalone select first:

select model,speed_power_ratio,process_comments from amd_bang_per_watt where model not like '%¹%' and process_comments not like '%¹%' order by speed_power_ratio;

which gives this output (abbreviated)

+---------------------+-------------------+---------------------------------------------+
| model | speed_power_ratio | process_comments |
+---------------------+-------------------+---------------------------------------------+
| 550 Black Edition | 38.75 | 45nm Callisto Q3-2009 |
...
| 9950 | 18.57 | 65nm B3 |
+---------------------+-------------------+---------------------------------------------+
34 rows in set (0.00 sec)

By doing such a check that you get what expected, you can avoid
messing around dropping and recreating your target table.
My target table will be named amd_bang_per_watt_sumary and here is the create table as select (CTAS):

mysql> create table amd_bang_per_watt_summary as select model,speed_power_ratio,process_comments from \
-> amd_bang_per_watt where model not like '%¹%' and process_comments not like '%¹%' order by \
-> speed_power_ratio desc;
Query OK, 34 rows affected (0.01 sec)
Records: 34 Duplicates: 0 Warnings: 0

In order to form the sql statement above I needed to do a bit of UTF8 looking up so that the 'superscript one' could be included correctly in the like bit of the statement.

How I pasted 'superscript one' onto the command line is to use charmap application and navigate to 'Latin-1 supplement' where I can see 'superscript one' (0xB9) amongst the superscript numerals.

mysql> select count(*),substr(process_comments,1,4) as nm from amd_bang_per_watt_summary group by nm;
+----------+------+
| count(*) | nm |
+----------+------+
| 15 | 45nm |
| 19 | 65nm |
+----------+------+
2 rows in set (0.00 sec)

mysql> select * from amd_bang_per_watt_summary where speed_power_ratio > 27;
+--------------------+-------------------+---------------------------------------------+
| model | speed_power_ratio | process_comments |
+--------------------+-------------------+---------------------------------------------+
| 550 Black Edition | 38.75 | 45nm Callisto Q3-2009 |
| 705e | 38.46 | 45nm Heka Q3-2009 |
| 905e | 38.46 | 45nm Deneb Q3-2009 |
| 545 | 37.50 | 45nm Callisto Q3-2009 |
| 900e | 36.92 | 45nm Deneb 2009 |
| 945 | 31.58 | 45nm Deneb (part no HDX945WFK4DGI) Q3-2009 |
| 720 Black Edition | 29.47 | 45nm Heka Q1-2009 |
| 9100e | 27.69 | 65nm B2 |
| 9150e | 27.69 | 65nm B3 |
| 910 | 27.37 | 45nm Deneb Q1-2009 |
| 710 | 27.37 | 45nm Heka Q1-2009 |
| 810 | 27.37 | 45nm Deneb Q1-2009 |
+--------------------+-------------------+---------------------------------------------+
12 rows in set (0.00 sec)

The warning count reported by the CTAS is the only thing I might worry about but being zero is good and the quick queries above demonstrated to me that all was well.

The mysql output above is fairly readable but a truer representation of what appeared on my display is in this publicly accessible googledoc.

The superscript one like '%%' bit of the sql might need a bit of explaining - The original wikipedia data
included some processors for which there was no official release date. My target table amd_bang_per_watt_summary will not contain those entries.

Hopefully this posting will save you from having to trawl the documentation for CTAS and saving you the experience of:
  • mysql - the page is so long and complicated that the [AS] bit is easy to miss
  • postgres - whilst CTAS is given a page of its own, there is no link from regular 'create table' so I personally thought it difficult to find.
  • both - The acronym CTAS whilst unofficial might be included as a tag purely for user convenience.

Tuesday, July 21, 2009

Sample data for import and xml export - mysql and postgres (part 3 of 3)

Following up on previous postings I begin by giving the full xml output which should be pastable to your computer and then imported to any database with an xml import facility.

amd_bang_per_watt__20090720.xml
amd_bang_per_watt__20090720.xml-mit-create

or just a couple of rows as appeared in the previous posting:
amd_bang_per_watt__20090720.xml-select

For two of the main open source databases I use, some graphical tools are listed here:
mysql-admin documentation available in English, Greek, Italian, Portuguese, Russian, Spanish, Ukrainian.

If you like a screenshotted introduction to the gui tools then here is a quick idea of the look and feel:



The version 6 release of mysql (maybe 2010) will include
load xml infile
will be a step forward. Version 6 will follow version 5.4 of mysql which is in testing as I write.


Commercial releases of mysql might show slightly different versions but in keeping with the title of this blog I tend to focus on gpl or free and open source flavours of software.


If you have the data in a variety of formats (usually the case if it is your data) then csv format and the formats mysqlimport has supported for the last 10 years should do fine to get the job done.


All major databases will likely soon offer xml import (if not already) which should help data interchange between retailers and suppliers providing xml standards are strictly adhered to. Newer xml proposed standards have public pages for further research at w3.org


If you are itching for bringing in xml data directly to mysql today then you might want to browse the following:


And now back to the data which I give in mysql own .sql format to aid anyone wanting to just load it directly rather than getting hung up on xml.
So at whatever point you have joined this posting thread you can quickly pick up from here if you are interested in what comes next which is all Postgres.

Before leaving mysql entirely we should export the data using the --compatible flag, so as the sample data and postgres postings can discuss what (if anything) must be done to the sql output from --compatible to make it totally postgres.

Here is what mysql produces from --compatible=postgresql:
Mark Slade had this to say regarding what to expect in his article about converting databases.
Despite how it sounds, this won't generate PostgreSQL compatible code, but it will get you halfway there.
Halfway sounds pretty good if you do not want to go down the 'csv out csv in' route. I explore both ways in the postgres postings.

Monday, July 20, 2009

Sample data for import and xml export - mysql and postgres (part 2a of 3) - amd_bang_per_watt-callisto

<?xml version="1.0"?>

<resultset statement="select * from amd_bang_per_watt where process_comments like '%allisto%'" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
<field name="model_family">X2 II </field>
<field name="model">550 Black Edition </field>
<field name="clock_speed">3.1</field>
<field name="l2cache">2x512k </field>
<field name="l3cache">6MB </field>
<field name="ht_bus_ghz">2</field>
<field name="voltage">1.15-1.425 </field>
<field name="socket">AM3 </field>
<field name="tdp_watts">80</field>
<field name="process_comments">45nm Callisto Q3-2009 </field>
<field name="speed_power_ratio">38.75</field>
</row>

<row>
<field name="model_family">X2 II </field>
<field name="model">545</field>
<field name="clock_speed">3</field>
<field name="l2cache">2x512k </field>
<field name="l3cache">6MB </field>
<field name="ht_bus_ghz">2</field>
<field name="voltage">1.15-1.425 </field>
<field name="socket">AM3 </field>
<field name="tdp_watts">80</field>
<field name="process_comments">45nm Callisto Q3-2009 </field>
<field name="speed_power_ratio">37.50</field>
</row>
</resultset>

Sample data for import and xml export - mysql and postgres (part 2 of 3)

Designing tables can be a hit and miss affair but looking at the warnings mysql has thrown up it does not look too far off:
+---------+------+----------------------------------------------------------------------+
| Level   | Code | Message                                                              |
+---------+------+----------------------------------------------------------------------+
| Note    | 1265 | Data truncated for column 'voltage' at row 1                         |
| Warning | 1265 | Data truncated for column 'process_comments' at row 1                |
| Note    | 1265 | Data truncated for column 'voltage' at row 2                         |
| Warning | 1265 | Data truncated for column 'process_comments' at row 2                |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 3  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 4  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 5  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 6  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 7  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 8  |
| Warning | 1265 | Data truncated for column 'process_comments' at row 9                |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 9  |
| Note    | 1265 | Data truncated for column 'voltage' at row 10                        |
| Note    | 1265 | Data truncated for column 'voltage' at row 11                        |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 13 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 14 |
| Warning | 1265 | Data truncated for column 'process_comments' at row 16               |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 16 |
| Warning | 1265 | Data truncated for column 'process_comments' at row 18               |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 18 |
| Note    | 1265 | Data truncated for column 'voltage' at row 23                        |
| Note    | 1265 | Data truncated for column 'voltage' at row 24                        |
| Note    | 1265 | Data truncated for column 'voltage' at row 25                        |
| Note    | 1265 | Data truncated for column 'voltage' at row 26                        |
| Note    | 1265 | Data truncated for column 'voltage' at row 27                        |
| Note    | 1265 | Data truncated for column 'voltage' at row 28                        |
| Note    | 1265 | Data truncated for column 'voltage' at row 29                        |
| Note    | 1265 | Data truncated for column 'voltage' at row 30                        |
| Warning | 1265 | Data truncated for column 'process_comments' at row 30               |
| Note    | 1265 | Data truncated for column 'voltage' at row 31                        |
| Warning | 1265 | Data truncated for column 'process_comments' at row 31               |
| Note    | 1265 | Data truncated for column 'voltage' at row 32                        |
| Note    | 1265 | Data truncated for column 'voltage' at row 33                        |
| Warning | 1265 | Data truncated for column 'process_comments' at row 33               |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 33 |
| Warning | 1265 | Data truncated for column 'process_comments' at row 36               |
| Warning | 1265 | Data truncated for column 'process_comments' at row 37               |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 37 |
| Note    | 1265 | Data truncated for column 'l2cache' at row 47                        |
| Warning | 1265 | Data truncated for column 'voltage' at row 47                        |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 48 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 49 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 50 |
+---------+------+----------------------------------------------------------------------+

Although the above looks a lot, it is really caused by just one or two issues which I will now address. I will provide more details on 'show warnings;' command that produced the above a bit later in this post.

Design alterations:
  • voltage to VARCHAR(15)
  • speed_power_ratio to get a default of '0.00' and might aswell make it not null in that case
  • l2cache and l3cache to VARCHAR(15)
  • process_comments to VARCHAR(50)
DROP TABLE IF EXISTS amd_bang_per_watt;
SET character_set_client = utf8;
CREATE TABLE amd_bang_per_watt (
model_family  VARCHAR(10) NOT NULL,
model   VARCHAR(30) NOT NULL default 'unknown',
clock_speed  VARCHAR(20) NOT NULL,
l2cache   VARCHAR(15),
l3cache   VARCHAR(15),
ht_bus_ghz  VARCHAR(10),
voltage   VARCHAR(15),
socket   VARCHAR(10),
tdp_watts  VARCHAR(20),
process_comments VARCHAR(50),
speed_power_ratio DECIMAL(5,2) NOT NULL default '0.00' COMMENT 'bang per watt'
) ENGINE=MyISAM DEFAULT CHARSET=utf8 MIN_ROWS=50 CHECKSUM=1 COMMENT='Source: Wikipedia 200907
http://en.wikipedia.org/wiki/List_of_CPU_power_dissipation#AMD_Phenom_.28Phenom_64.29';

And here I will show the way of importing the data so as to be able to issue show warnings to get some feedback on how the import went okay.
root@d183z2g:~# mysql amd_power_dissipation -u root -p
Enter password:                                   
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 66                       
Server version: 5.0.75-0ubuntu10.2-log (Ubuntu)      

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> LOAD DATA LOCAL INFILE '/tmp/amd_bang_per_watt.csv' INTO TABLE amd_bang_per_watt
FIELDS  TERMINATED BY ','  OPTIONALLY ENCLOSED BY '"'
LINES TERMINATEDBY '\n' IGNORE 1 LINES;
Query OK, 50 rows affected, 16 warnings (0.00 sec)
Records: 50  Deleted: 0  Skipped: 0  Warnings: 0

mysql> show warnings;
+---------+------+----------------------------------------------------------------------+
| Level   | Code | Message                                                              |
+---------+------+----------------------------------------------------------------------+
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 3  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 4  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 5  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 6  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 7  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 8  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 9  |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 13 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 14 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 16 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 18 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 33 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 37 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 48 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 49 |
| Warning | 1366 | Incorrect decimal value: '' for column 'speed_power_ratio' at row 50 |
+---------+------+----------------------------------------------------------------------+
16 rows in set (0.00 sec)

mysql> select model,speed_power_ratio from amd_bang_per_watt limit 5;
+--------------------+-------------------+
| model              | speed_power_ratio |
+--------------------+-------------------+
| 550 Black Edition  |             38.75 |
| 545                |             37.50 |
| 6??? x¹           |              0.00 |
| 6??? x¹           |              0.00 |
| 6??? x¹           |              0.00 |
+--------------------+-------------------+
5 rows in set (0.00 sec)
Note: The LOAD DATA LOCAL INFILE command in the session above should be all on one line but I split it here to fit the blog posting.

In Summary: The only warnings relate to missing values for speed_power_ratio in the input .csv file and those will be defaulted to 0.00 by means of the table definition.

Now the point of all this importing was to demonstrate xml output from mysql.

Mysql makes it real simple as the following example illustrates (the xml output is in the next post labelled part 2a of 3):
useri@d183z2g:~$ mysql --xml amd_power_dissipation -u root -p -e "select * from amd_bang_per_watt where process_comments like '%allisto%';"
Enter password: