Monday, April 21, 2008

Why non-technical people want to talk technical

I find it interesting that when a conversation begins between people with different areas of expertise, each person's domain jargon seems to impress the one who is not experienced in that area, while confusing them at the same time. Why would anyone respect complex, isolated knowledge that is never translated out of its domain terminology?

The sad truth is, there is no glory or praise deserved for speaking with someone who is not familiar with your expertise and then forcing them to understand your concepts in the same verbiage, terminology and language that you have become accustomed to.

In my own experience, technology often gets the "cool" or "nerd" appeal. Businesses all have a technology department. They need technology to make their lives easier and to make it easier to conduct business. In most cases the business is not the technology, so technology should not be the language used to communicate solutions or to describe the business. To non-technical people, those who can do technical things must be smart, and are, for some reason, immediately respected. Through this, I can understand why a non-technical person would want to talk the talk: to feel they've gained some knowledge and are more intelligent about the matter.

A technical individual should not burden an outsider with their terminology and language. This is acceptable within the field, but not beyond it. The goal is to understand each other so a solution can likely be built to make the non-technical person's life easier through technology.

My suggestion to technical folks is this: stop trying to feel better about yourself by confusing your client with terminology and language. Feel better about yourself by delivering a solution that is understood and is adapted to your client's needs as a person. Remember, the rest of the human race has a basic need: to be better humans. The point of technology is to make people better people, or a business a better business.

Aren't we better off when we understand each other? I think so.

My First Perl Program, Hello Bioinformatics

"Write a subroutine that takes as arguments an amino acid; a position 1, 2 or 3; and a nucleotide. It then takes each codon that encodes that specified amino acid (there may be from one to six such codons), and mutates it at the specified position to the specified nucleotide. Finally it returns the set of amino acids that are encoded by the mutated codons."

The following is my answer. "amino_to_codon.txt" stores the following, with a tab between each amino acid name and its codon list:

Isoleucine ATT, ATC, ATA
Leucine CTT, CTC, CTA, CTG, TTA, TTG
Valine GTT, GTC, GTA, GTG
Phenylalanine TTT, TTC
Methionine ATG
Cysteine TGT, TGC
Alanine GCT, GCC, GCA, GCG
Glycine GGT, GGC, GGA, GGG
Proline CCT, CCC, CCA, CCG
Threonine ACT, ACC, ACA, ACG
Serine TCT, TCC, TCA, TCG, AGT, AGC
Tyrosine TAT, TAC
Tryptophan TGG
Glutamine CAA, CAG
Asparagine AAT, AAC
Histidine CAT, CAC
Glutamic acid GAA, GAG
Aspartic acid GAT, GAC
Lysine AAA, AAG
Arginine CGT, CGC, CGA, CGG, AGA, AGG




#!/usr/bin/perl
# Terrence Pietrondi
# CIS667 Programming Assignment 1
use strict;
use warnings;

# map with codons as keys and amino acids as values
our %codon_to_amino_acid_map = ();
# map with amino acids as keys and codons as values
our %amino_acid_to_codon_map = ();

# load the command line parameters for this program
# no arguments expected
# returns amino acid, position and nucleotide
sub load_command_line{
# initialize the amino acid command line variable
my $amino_acid;
# initialize the position command line variable
my $position;
# initialize the nucleotide variable
my $nucleotide;
# the argument count
my $command_arguments = $#ARGV + 1;
# check for three command arguments
if($command_arguments == 3){
# assign amino acid
$amino_acid = $ARGV[0];
# assign position
$position = int($ARGV[1]) - 1;
# check the position
if($position < 0 or $position > 2){
print "Position must be either 1, 2, or 3!\n";
exit 1;
}

# assign the nucleotide
$nucleotide = $ARGV[2];
# return the amino acid, position and nucleotide
return $amino_acid,$position,$nucleotide;
} else {
# if we don't get three arguments

# show some help messages
print "Given $command_arguments command arguments, I need three!\n";
print " \n";
print "In the mean time, I'll load the driver....\n\n";
# set all the values as blank strings
$amino_acid = '';
$position = '';
$nucleotide = '';
# return the blank string
return $amino_acid,$position,$nucleotide;
}
}

# trim trailing and leading whitespace from a string
# expects a single string argument
# returns the new string
sub trim_string{
my ($trim_string) = @_;
$trim_string =~ s/^\s+//;
$trim_string =~ s/\s+$//;
return $trim_string;
}

# load the amino acid & codon data from a file
# no arguments
# returns the raw data lines of the file as an array
sub load_data{
# load the codon map to aminos from a file
my $amino_to_codon_txt="amino_to_codon.txt";
open(DAT, $amino_to_codon_txt) || die("Could not open file: $amino_to_codon_txt!");
my @raw_data = <DAT>;
close(DAT);

return @raw_data;
}

# load the mappings in the data files into the data maps of this
# program
# expects one argument, the raw data array of lines from the data file
# returns nothing, loads global data maps
sub load_maps{
my (@raw_data) = @_;
my $line;
foreach $line (@raw_data){
# clean the line terminator (chomp only removes a trailing newline,
# so a last line without one is left intact)
chomp($line);
my $loop_amino_acid;
my $loop_codon;
# split the line on the tab character to separate the amino acid from its codons
($loop_amino_acid,$loop_codon) = split(/\t/,$line);
# trim the codon string
$loop_codon = &trim_string($loop_codon);
# trim the amino acid string
$loop_amino_acid = &trim_string($loop_amino_acid);
# set the codons for the given amino acid in the data
$amino_acid_to_codon_map{$loop_amino_acid} = $loop_codon;
# split the codons on the comma
my @codon_array = split(/,/, $loop_codon);
my $loop_codon_inner;
# for each codon for this amino acid
foreach $loop_codon_inner (@codon_array){
# trim the codon
$loop_codon_inner = &trim_string($loop_codon_inner);
# set the amino acid for this codon in the map
$codon_to_amino_acid_map{$loop_codon_inner} = $loop_amino_acid;
}
}
}

# run the mutation steps
# expects three arguments, the amino acid, the position and the nucleotide
# returns an array of amino acids encoded by this mutation
sub mutate{
my($amino_acid,$position,$nucleotide) = @_;
my $three = 3;
# the length of the nucleotide
my $nucleotide_length = length $nucleotide;
# the three characters of the nucleotide starting at the given position
my $loop_codon = substr $nucleotide,$position,$three;
# the codons for this amino acid
my $codon_string = $amino_acid_to_codon_map{$amino_acid};

# some help messages or progress
print "Your amino acid: '$amino_acid'\n";
print "Its codons: $codon_string\n";
print "Will mutate at position: $position\n";
print "On nucleotide: $nucleotide\n";

my %encoded_amino_acids = ();
# while the loop codon is not blank
while ($loop_codon ne '') {
# check that the position work will not be out of range
if($position + 3 <= $nucleotide_length){
my $loop_codon_inner;
# split the codon string on the comma
my @codon_array = split(/,/, $amino_acid_to_codon_map{$amino_acid});
foreach $loop_codon_inner (@codon_array){
# trim the codon
$loop_codon_inner = &trim_string($loop_codon_inner);
# compare the left side of the mutation to the right side
my($left,$right) = &compare_codons($loop_codon,$loop_codon_inner);
# get the acid that maps to this mutation
my $acid = $codon_to_amino_acid_map{$loop_codon};
# add the acid to the return array
$encoded_amino_acids{$acid} = $loop_codon;
# show the difference
print "$left ==> $right\n";

}
# move the position
$position = $position + 3;
# get the next loop codon
$loop_codon = substr $nucleotide,$position,3;

} else {
# if the position changes are out of range, set the break condition
$loop_codon = '';
}
}
# return the keys of the amino acid map, we just want the amino acids encoded by
# this mutation. we use the map to model a set
return keys %encoded_amino_acids;

}

# compare two codons
# expects two arguments, each a codon
sub compare_codons{
my($left_codon,$right_codon) = @_;

my $new_left;
my $new_right;

my $three = 3;
my $index = 0;
# compare each position of the codon
while($index < $three){
my $left_char = substr $left_codon,$index,1;
my $right_char = substr $right_codon,$index,1;

# if the characters match, copy them unchanged
if($left_char eq $right_char){
$new_left .= $left_char;
$new_right .= $right_char;
} else {
# if the characters differ, mark the base change with parentheses
$new_left .= "($left_char)";
$new_right .= "($right_char)";
}
# increment the index
$index++;
}
#return the mutation presentation string
return($new_left,$new_right);

}

# print the array of acids
# expects one argument, an array of acids
# returns nothing
sub print_acids{
my @acids = @_;
my $loop_acid;
foreach $loop_acid (@acids){
print "$loop_acid\n";
}

print "\n";
}

# the command line driver
# expects three arguments, the acid, the position and the nucleotide
# returns nothing
sub command_driver{
my($amino_acid,$position,$nucleotide) = @_;
# load the data
my @raw_data = &load_data();
# load the data maps with the data from the file
&load_maps(@raw_data);
# perform the mutations
my @acids = &mutate($amino_acid,$position,$nucleotide);
# show what acids are encoded by the mutations
&print_acids(@acids);
}

# hard coded driver, just runs the command line driver with pre-defined variables
# expects no arguments
# returns nothing
sub hard_coded_driver{
my $amino_acid = 'Valine';
my $position = 1;
my $nucleotide = 'GGGAAACCC';

&command_driver($amino_acid,$position,$nucleotide);

$amino_acid = 'Valine';
$position = 3;
$nucleotide = 'GGGAAACCC';

&command_driver($amino_acid,$position,$nucleotide);

$amino_acid = 'Serine';
$position = 2;
$nucleotide = 'GGGAAACCC';

&command_driver($amino_acid,$position,$nucleotide);
}

# load the command line
my($amino_acid,$position,$nucleotide) = &load_command_line();
# if the variables are blank, then there were no arguments given, and so we will run the
# hard coded driver
if($amino_acid eq '' and $position eq '' and $nucleotide eq ''){
&hard_coded_driver();
} else {
# otherwise, we will run on the given arguments
&command_driver($amino_acid,$position,$nucleotide);
}
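
Read literally, the assignment quoted above asks for a single-base substitution in each codon of the given amino acid, rather than a walk along a longer sequence. A minimal sketch of that reading follows. It is an illustration, not the graded answer: the name mutate_amino_acid is hypothetical, the table is built in memory from the standard genetic code instead of being read from amino_to_codon.txt, and mutations that produce a stop codon are reported as 'Stop', which is an assumption since the data file above has no stop codons.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard genetic code, listed in T, C, A, G order for each codon position.
my @bases   = qw(T C A G);
my @letters = split //,
    'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %name_of = (
    F => 'Phenylalanine', L => 'Leucine',       S => 'Serine',
    Y => 'Tyrosine',      C => 'Cysteine',      W => 'Tryptophan',
    P => 'Proline',       H => 'Histidine',     Q => 'Glutamine',
    R => 'Arginine',      I => 'Isoleucine',    M => 'Methionine',
    T => 'Threonine',     N => 'Asparagine',    K => 'Lysine',
    V => 'Valine',        A => 'Alanine',       D => 'Aspartic acid',
    E => 'Glutamic acid', G => 'Glycine',       '*' => 'Stop',
);

# Build both lookup directions: codon => amino acid, amino acid => codons.
my (%amino_for, %codons_for);
for my $first (@bases) {
    for my $second (@bases) {
        for my $third (@bases) {
            my $codon = "$first$second$third";
            my $amino = $name_of{ shift @letters };
            $amino_for{$codon} = $amino;
            push @{ $codons_for{$amino} }, $codon;
        }
    }
}

# Hypothetical subroutine: $position is 1, 2 or 3; $nucleotide is one base.
# Returns the sorted set of amino acids encoded by the mutated codons.
sub mutate_amino_acid {
    my ($amino_acid, $position, $nucleotide) = @_;
    die "Unknown amino acid: $amino_acid\n"
        unless exists $codons_for{$amino_acid};
    die "Position must be 1, 2 or 3\n" unless $position =~ /^[123]$/;
    die "Nucleotide must be A, C, G or T\n" unless $nucleotide =~ /^[ACGT]$/;

    my %seen;    # a hash used as a set
    for my $codon (@{ $codons_for{$amino_acid} }) {
        my $mutated = $codon;
        # replace the single base at the requested (1-based) position
        substr($mutated, $position - 1, 1, $nucleotide);
        $seen{ $amino_for{$mutated} } = 1;
    }
    my @result = sort keys %seen;
    return @result;
}

# Valine (GTT, GTC, GTA, GTG) with position 2 set to C gives GCT, GCC, GCA, GCG.
print join(', ', mutate_amino_acid('Valine', 2, 'C')), "\n";   # prints "Alanine"
```

Using a hash as the set keeps the result free of duplicates, the same trick the program above uses when it returns the keys of its result map.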

Monday, April 14, 2008

Why can't the open source software development model help the financial sector?

Why can't the open source software development model help the financial sector? The banks, the lenders, the market, or whatever you want to call it sucks. Layoffs, for sale signs, offshoring, foreclosures, etc., are happening everywhere. In terms of technology, why aren't these companies banding together to help each other save on operating costs?

Sure, sure, sure, there are plenty of proprietary activities going on, but think of all the common activities. Account opening, loan servicing, online banking: these are all common functions of a financial institution. The technology that drives these creates the status quo amongst peers in this industry. How this technology is used and implemented creates the edge, not the actual code, the core functional code. I say, open it up, share it. Standardize the financial platform and raise the bar from technology-centric edges to the people and the how.

Now the projects are open to developers across these institutions, across the world, across time zones, across languages. Project funding is modeled after open source development: donations, grants, government. Need computing power or resources? The major companies will soon be opening their environments to share their unused resources for a price. The financial grid is born, and even the small guys can contribute as well.

But nobody wants to hear it; everyone needs their arms and control around their code and servers. Trust me, it's not as valuable as it appears. The value is the use, the how, not the what.

Sunday, April 13, 2008

What is "Docsum"?

Once upon a time, I had an idea. There is a lot of documentation out there, meaning web pages, news feeds, research papers, e-mails, etc. All typed in text, all in language, all with meaning, all without structure... mostly. So I was thinking, wouldn't it be nice to have a search that, rather than returning links to documentation, presented a summary of the documentation it knew about, with references to the originating documents? The goal being: just tell me about the important aspects of a topic, rather than me having to go out to each result and find the sections that are relevant to the topic I am interested in.

From this idea, I started a project on SourceForge called Docsum. I haven't spent much time developing this project in a while, but I wanted to write a little about it. The process works like this: you add some sort of documentation (the demo site uses software licenses, for example), and then enter a topic to work against the added content to form a summary. A topic entered might be "free speech". The result is a document summarizing the content in the Docsum repository pertaining to "free speech".
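
How Docsum actually builds its summary is not described here, so the following is only a naive illustration of the add-documents-then-ask-for-a-topic flow: keep every sentence that mentions the topic and tag it with the document it came from. The name summarize_topic, the file names and the sample sentences are all made up for the example, and the sketch is in Perl rather than Docsum's Python.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not Docsum's real code.
# $documents is a hash reference mapping a source name to its plain text.
# Returns the sentences that mention $topic, each followed by its source.
sub summarize_topic {
    my ($topic, $documents) = @_;
    my @summary;
    for my $source (sort keys %$documents) {
        # naive sentence split: break after ., ! or ? followed by whitespace
        for my $sentence (split /(?<=[.!?])\s+/, $documents->{$source}) {
            # case-insensitive, literal match on the topic
            next unless $sentence =~ /\Q$topic\E/i;
            push @summary, "$sentence [$source]";
        }
    }
    return @summary;
}

# Made-up stand-ins for the documents in the repository.
my %documents = (
    'license-a.txt' => 'This license protects free speech. It also covers patents.',
    'license-b.txt' => 'Redistribution is allowed. Nothing here limits Free Speech!',
);

print "$_\n" for summarize_topic('free speech', \%documents);
```

Keeping the source name next to each sentence is what gives the reader the reference back to the originating document.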

The current project is limited to only plain text files, and RSS/Atom XML. The generated summary is plain text as well, and is not pretty. But isn't that what ideas are all about... the idea.

One day, I hope to pursue this project more, but for now my time is limited. At the moment, the project is made up of three parts: the web view, the command line view, and the datastore. I wanted a good command line interface into the application, and due to the ease of using Python for text processing, all the core functionality is written in Python. The web end is simple PHP pages that interact with the Python core by running subprocesses. The datastore is MySQL, and the Python core interacts with it directly.


When the Del.icio.us tagging suggest didn't work

For some reason or another, I recently had trouble tagging on my del.icio.us account. When I was on a page and ready to bookmark it, I'd hit my handy browser tag button to save the page. Usually, in the tagging section of the pop-up, a handy AJAX suggest feature lists tags that you've used before as you type. Great feature, one I have become unable to live without, it seems. Everything fell apart for me when this feature failed to work.

For some reason or another, I have trouble creating a new tag when I know it is not a tag that many items are likely to fall into, and therefore, as much as possible, I try to re-use my existing tag set. And so, when I tag, I am not just filling in the values. At the same time, I am checking whether the tag I am thinking of exists in my set and, if it does, how many other items already carry it. When this failed to work, I failed to tag.

I read a large number of feeds per day, or whenever I get a chance to read. For these few days, I didn't know how to save these items. Do I star them in Google Reader, do I just bookmark them in the browser, do I e-mail the pages to myself? I didn't know what to do.

What I did do was go home (since this problem was happening at work), scroll back down through the articles marked as read in my Google Reader list, and re-visit the pages to bookmark them. I'm not sure if it was a network issue at work or what. I found some threads in the user group discussion regarding the same issue, but no solution.

In any case, I love using Del.icio.us. If it ever went away... I don't know what I would do... I guess I should back up my bookmarks, but I don't... oh well.


Saturday, April 12, 2008

When will I ever use this in real life? (Breadth First Search)

A lot of times, while taking classes, learning, being educated, etc., we ask ourselves, "When will I ever use this in real life?" "This" being some concept that is not concrete, but abstract.

In terms of computer science and algorithms, you might not often get to use all the good stuff you learned in school in practice, depending on the company you work for. I say this because a lot of the focus, or at least the priority, is on functionality and feature design. When these come first, speed and efficiency come second: the client expects them along with the features and functions, but they are not documented as requirements. Speed and efficiency then get rolled up into refactoring, or base work, or core work, or whatever.

Recently I had an opportunity to implement, in practice, an actual algorithm I learned in school. I had to implement a graph traversal algorithm for a relational analysis application I am working on. The objective of the project is to capture the connected structure of "things" and demonstrate that structure through images. The "things" and their "relationships" are stored in a generic graph structure. Visualizing this requires picking a starting point and then showing the graph from that starting point.

At first, I just hammered out some crap code to get it to work. The problem was that there was a bug that once a "thing" was shown, it was never visited again. So if A->B is shown when expanding A, B is never expanded.

When I had the chance, I fixed this using breadth first search. Oh the fun, taking pseudo code to actual code, and making it fit my needs. My needs were to generate graphviz dot graph output to render images, with a depth limit and a node limit both based on user input. My needs were not to search, but just to walk the graph. I didn't want depth first because I wanted to expand each depth one at a time, and then move to the next. I figured this out later, after realizing the difference between the two, depth versus breadth. All in PHP, with my datastore as a MySQL database (I'll expand on the schema of the database design in another post sometime).
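
A minimal sketch of such a walk, written in Perl rather than the PHP of the real application, and using a hypothetical in-memory graph in place of the MySQL datastore. The name walk_to_dot, its parameters and the sample graph are assumptions made for illustration. Each node is queued once and expanded when it reaches the front of the queue, so a node that was first shown as someone's neighbour still gets expanded later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample graph: node => list of nodes it points to.
my %edges = (
    A => [qw(B C)],
    B => [qw(D)],
    C => [qw(D)],
    D => [qw(A)],
);

# Breadth first walk from $start that emits graphviz dot text.
# $max_depth limits how far from $start nodes are expanded;
# $max_nodes limits how many distinct nodes may appear in the output.
sub walk_to_dot {
    my ($graph, $start, $max_depth, $max_nodes) = @_;
    my %depth_of = ($start => 0);    # also serves as the "seen" set
    my @queue    = ($start);
    my @lines;

    while (@queue) {
        my $node = shift @queue;     # first in, first out: level by level
        next if $depth_of{$node} >= $max_depth;
        for my $next (@{ $graph->{$node} || [] }) {
            if (!exists $depth_of{$next}) {
                # node limit reached: leave out nodes not seen before
                next if scalar(keys %depth_of) >= $max_nodes;
                $depth_of{$next} = $depth_of{$node} + 1;
                push @queue, $next;
            }
            # draw the edge even when $next was already seen
            push @lines, qq{  "$node" -> "$next";};
        }
    }
    return join("\n", 'digraph G {', @lines, '}') . "\n";
}

print walk_to_dot(\%edges, 'A', 2, 10);
```

With a depth limit of 2 this draws A -> B, A -> C, B -> D and C -> D, and stops before expanding D. A depth first walk would instead follow A, B, D all the way down before ever looking at C, which is why it does not expand one level at a time.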

Anyway, when I was done, I reflected on how often I get to apply in practice the concepts I learned in school. I realized it was not too often, and maybe that is why I enjoyed doing so. I am looking forward to any opportunity to do more.

By the way, I went to Kent State for my undergraduate, and am attending Cleveland State for my graduate, both in computer science.
