Coding, is? Fun!

Friday, July 27, 2007

Interactions with American clients - Twenty tips

Dealing with clients in a services business is an art. In IT, we primarily deal with American clients (at least in Photon), so it is useful to know how they use the English language.

1. Do not write "the same" in an email - it makes little sense to them.
Example -
I will try to organize the project artifacts and inform you of the same when it is done

This is somewhat of an Indian construct. It is better written simply as:
I will try to organize the project artifacts and inform you when that is done

2. Do not write or say, "I have some doubts on this issue"
To Americans, "doubt" implies distrust - doubting someone. We use this word because in Indian languages (such as Tamil), the word for a "doubt" and a "question" is the same.
The correct usage (for clients) is:
I have a few questions on this issue

3. The term "regard" is not used much in American English. They usually do not say "regarding this issue" or "with regard to this".
Simply use, "about this issue".

4. Do not say "Pardon" when you want someone to repeat what they said. The word "Pardon" is unusual for them and is somewhat formal.

5. Americans often do not understand the Indian accent immediately - I have heard some of them say that they catch only about 75% of what we speak and interpret the rest. Therefore, try not to use contractions such as "can't" or "don't". Use the expanded "cannot" or "do not".

6. Do not use the term "screwed up" liberally. If a situation is not good, it is better to say, "The situation is messed up". Do not use words such as "shucks", or "pissed off".

7. As a general matter of form, Indians interrupt each other constantly in meetings - do NOT interrupt a client when they are speaking. Over the phone, there could be delays - but wait for a short time before responding.

8. When explaining a complex issue, stop occasionally and ask, "Does that make sense?" This is preferable to "Do you understand me?"

9. In email communications, use proper punctuation. To explain something without breaking your flow, use semicolons, hyphens or parentheses. For example:
You have entered a new bug (the popup not showing up) in the defect tracking system; we could not reproduce it - although, a screenshot would help.

Notice that a reference to the actual bug is added in parentheses so that the flow of the sentence is not broken. Break up long sentences using such punctuation.

10. In American English, "mail" is a posted letter; "email" is electronic mail. When you say
"I mailed the information to you"
it means you sent an actual letter or package through the postal system.
The correct usage is:
"I emailed the information to you"

11. To "prepone" an appointment is an Indian usage; there is no such word in American English. You can "advance" an appointment or "move it up".

12. In the term "N-tier Architecture" or "3-tier Architecture", the word "tier" is NOT pronounced as "Tire". I have seen many people pronounce it this way. The correct pronunciation is "tea-yar". The "ti" is pronounced as "tea".

13. The usages "September End", "Month End", "Day End" are not understood well by Americans. They say "End of September", "End of Month" or "End of Day".

14. Americans have unusual conventions for time - when they say the time is "quarter of one", they mean 12:45 (a quarter to one). It is better to ask for the exact time.

15. Indians commonly use the terms "Today Evening", "Today Night". These are not correct; "Today" means "This Day" where the Day stands for Daytime. Therefore "Today Night" is confusing. The correct usages are: "This Evening", "Tonight".
That applies for "Yesterday Night" and "Yesterday Evening". The correct usages are: "Last Night" and "Last Evening".

16. When Americans want to know the time, it is usual for them to say, "Do you have the time?" - which makes little sense to an Indian.

17. There is no such word as "updation". You update somebody; you wait for updates to happen to the database. Avoid saying "updation".

18. When you talk with someone for the first time, refer to them as they refer to you - in America, the first conversation usually starts by using the first name. Therefore you can use the first name of a client. Do not say "Sir". Do not call women "Madam".

19. It is a usual convention in initial emails (particularly technical ones) to expand abbreviations, like this:
We are planning to use the Java API for XML Registries (JAXR).

After mentioning the expanded form once, subsequently you can use the abbreviation.

20. Make sure you always have a subject in your emails and that the subject is relevant. Do not use a subject line such as "Hi".

OK, one of my friends forwarded me an email a couple of days back in which someone STOLE this blog post, added some formatting, and is now sending it around as a forward to everyone.
This is my first exposure to IP theft - whoever took it should have at least attributed it to me.
They have also added four other tips - and these are completely wrong. Let me list them here:
21. Avoid using "Back" instead of "Back" Use "ago".Back is the worst word for American.(for Days use "Ago",For hours use "before")

[Ram] You CAN say "back". In fact, Americans understand "back" better than "ago". Nobody says "four days ago" in the USA.

22.Avoid using "but" instead of "But" Use "However".

[Ram] It depends on the context. You CAN say "But".

23.Avoid using "Yesterday" hereafter use "Last day".

[Ram] Who ever says "I had dinner last day"? This is idiotic. You CAN say "yesterday". It is the right thing to say.

24.Avoid using "Tomorrow" hereafter use "Next day".

[Ram] Again, who ever says "I will meet you next day"? You CAN say "tomorrow".

Whoever took my post and added the last four tips is clueless.

Thursday, July 19, 2007

Application Framework - Project Breakdown

We have to break down the project into different modules. Since these are dependent on each other, it is always a challenge to manage this:
1. Start with an Events namespace - to wrap asynchronous communication. This may need some thread management; it may help to break it down into a set of Services and Tasks, where a single Task wraps a thread. By doing this, we will be able to publish exceptions asynchronously.
2. Build an Exceptions namespace - with a base exception and some synchronous and asynchronous logging capabilities. The log will be a stub for now - it may just log to the file system.
3. Build a Caching base that will hold a set of records in a refreshable cache.
4. Build a Configuration namespace for managing name-value configurations and caching them.
5. Build a Lookup Manager that loads read-only data and caches it in lookup objects.
6. Build a Security namespace with a principal for forms authentication.
7. Build a Connection String Configuration Manager
8. Build a Persistence namespace that manages saving and loading objects.

The Code generator can be built independently.
The Connection String configuration and the Configuration namespace can also be built independently.
The Exception logging part can also be independent.
The rest of it may see some intense coding.
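To make the Events idea above concrete, here is a minimal sketch in C# (.NET 2.0 style): a Task wraps a thread and publishes any exception through an event instead of crashing the caller. The names Task, WorkItem and ExceptionRaised are illustrative, not the framework's actual API.

```csharp
using System;
using System.Threading;

// The unit of work a Task will run on its own thread.
public delegate void WorkItem();

public class ErrorEventArgs : EventArgs
{
    public readonly Exception Error;
    public ErrorEventArgs(Exception error) { Error = error; }
}

public class Task
{
    private readonly WorkItem _work;

    // Subscribers are notified asynchronously when the work throws.
    public event EventHandler<ErrorEventArgs> ExceptionRaised;

    public Task(WorkItem work) { _work = work; }

    public void Start()
    {
        Thread t = new Thread(Run);
        t.IsBackground = true;
        t.Start();
    }

    private void Run()
    {
        try { _work(); }
        catch (Exception ex)
        {
            // Publish the exception instead of letting it kill the process.
            EventHandler<ErrorEventArgs> handler = ExceptionRaised;
            if (handler != null) handler(this, new ErrorEventArgs(ex));
        }
    }
}
```

The Exceptions namespace can then subscribe to ExceptionRaised and route errors to the logging stub.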

Framework Development - Code generation

Persistence frameworks, for rapid development, usually ride on generated code. I have worked with different tools - custom and commercial - that let you point at a database table or an XML file and generate classes. They also generate stored procedures for CRUD operations.
We faced several decisions regarding such code generators: should we use CodeSmith, or build our own? Should we integrate with Visual Studio? How do we make sure the spewed code is not modified, but is extended correctly? Do we use the table as the master schema, or spew XML and then generate code from the XML?

Integrate with Visual Studio
I think a code generator plug-in is very convenient and increases productivity remarkably. But it is also troublesome to install and configure - and it is overkill upfront. We will use an independent generator for now and develop (or buy) a plug-in separately later.

How to extend spewed code
I have seen code generators that use an XML/XSLT model to generate code; or use templated code (as CodeSmith does); or just write out lines of code from a generator. In general, you spew code from a database table schema, then go ahead and write some business logic in the spewed class. But the spewed class may need to be respewed for additional modifications.
To manage this, some tools detect that the spewed class has changed and show a manual merging tool.
Ideally the class file should not be modified at all - but that is almost impossible to enforce. I have seen some good innovative solutions using partial classes (in .NET 2.0). You split the same class into a spewed, unchangeable class file and a changeable class file. You can add business logic to the changeable class file.
You can manage the same thing using a spewed, unchangeable base class and a changeable subclass.
Visual Studio, of course, spews code all the time when designing WinForms screens. In MFC with Visual C++, it would simply ask you not to change certain lines.
I think actively trying to prevent developers from changing the spewed code is overkill (at least in internal applications). You can use a convention of storing spewed classes in a separate folder and then hope that nobody changes them. At the other extreme, we can avoid spewing code at all: generate and compile the code dynamically during the build using a custom build action. That makes sure nobody can modify these classes. (XML serialization does something similar.)
For now, I am leaning towards a loose approach - we spew the code and hope everybody confines their changes to the subclasses (which are spewed along with the base classes). If necessary, we will tighten up later (by taking whoever changes the spewed code outside and shooting him).
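As an illustration of the partial-class approach, here is a sketch (not actual generator output - Customer is a made-up entity) showing how the spewed file and the hand-written file split one class across two files:

```csharp
// Customer.Generated.cs -- spewed by the generator; never edited by hand.
// A re-spew simply overwrites this file.
public partial class Customer
{
    private int _id;
    private string _name;

    public int Id { get { return _id; } set { _id = value; } }
    public string Name { get { return _name; } set { _name = value; } }
}

// Customer.cs -- the changeable half; business logic lives here and
// survives a re-spew of the generated file.
public partial class Customer
{
    public bool IsValid()
    {
        return _name != null && _name.Length > 0;
    }
}
```

The compiler merges both files into a single Customer class, so the hand-written half has full access to the generated fields.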

XML is king?
One of the ideas is that we spew XML (with column names, datatypes) from the table schema and use that as the basis for generating code. This has its advantages - you can add some custom properties in the XML and regenerate code. What if you wanted a property to be readonly? You can add that as a custom attribute to the XML.
The core issue is that a table schema does not contain ALL the information you can play with in a C# class - marking a property as readonly is one example. So philosophically I like the XML file idea, but it is not practical to maintain both a spewed XML file AND a spewed class file.
The problem is that people can mess up the XML and cause the table schema and XML to grow apart.
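A hypothetical example of such a spewed-and-annotated XML file - the element and attribute names here are invented for illustration:

```xml
<!-- Spewed from the table schema, then annotated by hand.
     The readonly attribute is the kind of metadata a table cannot express. -->
<entity name="Customer" table="Customers">
  <property name="Id"        column="CustomerId" type="int"      readonly="true" />
  <property name="Name"      column="Name"       type="string" />
  <property name="CreatedOn" column="CreatedOn"  type="DateTime" readonly="true" />
</entity>
```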

So, for now, no XML. We will be generating code from a generator; storing in separate folders; and then subclassing from these classes, if we want to extend them.


Application Framework Development

We are discussing the development of a Microsoft-Application-Block style framework that would offer the following services:
1. Persistence
2. Caching
3. Lookup Management
4. Workflow
5. Multi-threaded tasks
6. Code Generation (for sprocs, persistence classes, lookup classes)
7. Configuration Management for connection strings
8. Exception Management, Logging, Tracing
9. Security (Authentication, Authorization)
10. Events and Asynchronous wrappers

The project is ambitious and we have a few models to choose from. I will be posting on this (hopefully) regularly from now on.
The purpose of this framework is to help quickly develop intranet web applications - such as an internal Library Management app, a Resume Management app, or a Timesheet Tracker.
We have not yet decided if the framework will be open source - we will cross that bridge when we come to it. Let us see if we complete it first.

The first decision we made was to use .NET, because all of us are familiar with it. But we also decided to keep the patterns documented so that we can port to Java at some point.
It will use .NET 2.0, and hence features such as Generics. It will not use .NET 3.0 for workflow management - for now; I am leaving that option open. We will use patterns from the Workflow Foundation.

The applications built on top of this framework can be web or Windows based. I like service-based applications, because they make it easy to switch between Ajax and Flex for the front end. We do not plan to build our own UI controls.

C# is the language of our choice.
NUnit is our choice for a testing tool.

Our primary target database server will be SQL Server 2005, but I want to design the framework so that MySQL support is possible in the future. This is not very hard to do; the MS Data Access Application Block has a nice set of patterns for such database agnosticism.
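A minimal sketch of what such database agnosticism can look like in .NET 2.0, using the standard DbProviderFactories mechanism (the Database class and method here are illustrative, not the Application Block's actual API):

```csharp
using System.Data.Common;

public static class Database
{
    // The provider name would come from configuration;
    // "System.Data.SqlClient" ships with the framework, and a MySQL
    // provider could be swapped in later without touching this code.
    public static int ExecuteNonQuery(string providerName,
                                      string connectionString,
                                      string commandText)
    {
        DbProviderFactory factory = DbProviderFactories.GetFactory(providerName);
        using (DbConnection connection = factory.CreateConnection())
        {
            connection.ConnectionString = connectionString;
            connection.Open();
            using (DbCommand command = connection.CreateCommand())
            {
                command.CommandText = commandText;
                return command.ExecuteNonQuery();
            }
        }
    }
}
```

Because the code only uses the abstract DbConnection and DbCommand types, switching database servers becomes a configuration change.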

It is always a challenge in these kinds of frameworks to decide which areas should be delegated to third-party (open source) providers. One example is the logging framework - Log4net is an obvious choice. It supports logging to the file system, the event log or a database, and it supports different levels of configuration-based logging. The only issue I have with third-party software, particularly open source, is that it does not upgrade to later versions at your convenience. I am almost convinced we should use Log4net; but it is also a question of complexity. It is likely we will standardize on a database log, and we know the database server will be SQL Server. Do we really need the extensible, "appender"-based system that Log4net provides?
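For a sense of the complexity trade-off, this is roughly what a minimal Log4net configuration looks like - a single file appender with a level threshold (the file name and layout pattern are illustrative):

```xml
<log4net>
  <appender name="FileAppender" type="log4net.Appender.FileAppender">
    <file value="app.log" />
    <appendToFile value="true" />
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%date %-5level %logger - %message%newline" />
    </layout>
  </appender>
  <root>
    <level value="INFO" />
    <appender-ref ref="FileAppender" />
  </root>
</log4net>
```

Swapping the file appender for a database appender is a configuration change, which is the main argument for the appender model.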

Other than Log4net, the other area that may need third-party software (although not necessarily free) is code generation. I will write about this in detail in the next post.

We plan to follow Microsoft standard naming conventions.
Comments? What comments?


Wednesday, July 04, 2007

Reengineering a website

I have been working on rearchitecting a public website for the last few months. The experience has been instructive.
The website has a large class library of VJ++ components, at least a dozen databases, and hundreds of ASP pages. Our goals were:
1. To shift the code base to .NET and use an application framework designed by us.
2. To exploit this opportunity to redesign the business logic; split it into a set of services; and thus simplify maintenance and enhance performance.

There were several major modules in the website. We decided to approach each of these separately and organize the project into several phases for each module. By doing so, we consciously tried to decouple these modules - if module A was dependent on module B, we would try to isolate these dependencies into a service layer that implemented a facade. For example, the member information was used by all modules - and so we built a read-only service that provided the other modules with the member information.

The approach we took for each major module was this:
Step I - Discovery
1. Collect as many internal documents - both business and technical - as are available, and keep them in a repository.
2. Navigate through the website ourselves, collecting database profiler data (such as the stored procedures executed) and Fiddler data (Fiddler is an HTTP debugging proxy). This gave us the client-to-middle-tier interface and the middle-tier-to-database interface. We put these together in a sequence diagram.
3. List the database tables and try to infer their role in the application flow. Tie these to the higher level workflow in the application.
4. If the existing middle tier was well done, we would be lucky enough to find a set of manager or service interfaces that encapsulate business logic. List such manager classes and their interfaces.
5. Talk to the business users. Walk through the backend administration systems, if any. Usually, big websites also have administrative systems (such as accounting and billing systems) that write data to the database tables. Taking a look at them helps us understand the tables better, and keeps those backoffice systems in mind when redesigning the main website.
At this stage the goal is to collect as much data as possible. Vague ideas about the future architecture of the system and the services boundary float around and are captured. But the goal is to get our heads around the system.
In itself, the Discovery phase is useful because it lets us collect and organize information in a single place. In our case, we dumped this information into a huge document with many links to sub-documents. It was raw data.

Step II - Analysis
In the second phase, we block out each of the different subsystems in the overall website and discuss their inner workings. The data we collected in Phase I is useful for this (particularly the Fiddler and profiler information). We also organize the discovery documentation and enhance it with sequence diagrams and flow charts. This may mean copying information from some existing documents.
The analysis is best done as a team of at least two members. It is a good idea to include someone already working on maintenance of the legacy website.
The output of analysis is a better organized document.
At this time, we can also document the rearchitecture ideas we have - any model we would like the new design to be built on. For example, we may think that a subsystem can be split into a read-only service and a read-write service. We may also have concrete ideas about the database model. It is better to document these and start discussing their merits and demerits; such discussions tend to vanish if not documented.

Step III - Building Specifications
For such reengineering projects, it is unusual to get concrete functional specifications (other than the single line "it should look and feel the same as the existing site").
Therefore, in this phase we can move to building an architecture document. This document will cover each of the subsystems we are redesigning, and any new subsystems being introduced. It will cover the rollout plan - critical in a website that cannot afford downtime - and also any integration plans.
Some of the subsystems could be straight ports from J++ to .NET. These have to be mentioned in the architecture document.
Some of these redesigns may require:
1. Digging deeper into the existing code to uncover business logic, or any violations of our model.
2. Building a proof of concept (POC) - in our case, we had to do this for the Flash-to-service-layer integration.

Step IV - Technical Specifications and Estimation
From this point on the project feels like a development project and goes through similar iterations.

The collection of as much data as possible about the existing (legacy) system is crucial for the rearchitecture to be a success.

Thanks to Mr. Jorge Gonzalez for sharing some of these approaches and ideas with me.
