This post has been pending for a long time. I worked on it over quite
a few months, and it is close to my heart because I believe I am a
programmer at heart. Whether it was fit for publishing was a long
debate with myself. I am finally publishing it after spending some six
hours on it on a Sunday, and I hope you find it a useful point of
guidance.
The complexity of a
program can be particularly confounding, because there isn’t
anything to put your hands on. When it breaks, you can’t pick up
something solid and look around inside it. It’s all abstract, and
that can be really hard to deal with. In fact, the average computer
program is so complex that no person could comprehend how all the
code works in its entirety. The bigger programs get, the more this is
the case. Thus, programming has to become the act of reducing
complexity to simplicity. Otherwise, nobody could keep working on a
program after it reached a certain level of complexity.
Superior coding
techniques and programming practices are hallmarks of a professional
programmer. The bulk of programming consists of making a large number
of small choices while attempting to solve a larger set of problems.
How wisely those choices are made depends largely upon the
programmer's skill and expertise.
This document
addresses some fundamental coding techniques and provides a
collection of coding practices from which to learn. The coding
techniques are primarily those that improve the readability and
maintainability of code, whereas the programming practices are mostly
performance enhancements.
The readability of
source code has a direct impact on how well a developer comprehends a
software system. Code maintainability refers to how easily that
software system can be changed to add new features, modify existing
features, fix bugs, or improve performance. Although readability and
maintainability are the result of many factors, one particular facet
of software development upon which all developers have an influence
is coding technique. The easiest method to ensure that a team of
developers will yield quality code is to establish a coding standard,
which is then enforced at routine code reviews.
The complex pieces
of a program have to be organized in some simple way so that a
programmer can work on them without having God-like mental abilities.
That is the art and talent involved in programming: reducing
complexity to simplicity.
A “bad programmer”
is just somebody who fails to reduce the complexity. Many times this
happens because people believe that they are reducing the complexity
of writing in the programming language (which is definitely a
complexity all in itself) by writing code that “just works,”
without thinking about reducing the complexity for other programmers.
It’s sort of like
this. Imagine an engineer who, in need of something to pound a nail
into the ground with, invents a device involving pulleys, strings,
and a large magnet. You’d probably think that was pretty
ridiculous.
Now imagine that
somebody tells you, “I need some code that I can use in any
program, anywhere, that will communicate between any two computers,
using any medium imaginable.” That’s definitely harder to reduce
to something simple. So, some programmers (perhaps most programmers)
in that situation will come up with a solution that involves the
equivalent of strings and pulleys and a large magnet, that is only
barely comprehensible to other people. They’re not irrational, and
there’s nothing wrong with them. When faced with a really difficult
task, they will do what they can in the short time they have. What
they make will work, as far as they’re concerned. It will do what
it’s supposed to do. That’s what their boss wants, and that’s
what their customers seem to want, as well. But one way or another,
they will have failed to reduce the complexity to simplicity. Then
they will pass this device off to another programmer, and that
programmer will add to the complexity by using it as part of her
device. The more people who don’t act to reduce the complexity, the
more incomprehensible the program becomes.
As a program
approaches infinite complexity, it becomes impossible to find all the
problems with it. Jet planes cost millions or billions of dollars
because they are close to this complex and were “debugged.” But
most software only costs the customer about $50–$100. At that
price, nobody’s going to have the time or resources necessary to
shake out all of the problems from an infinitely complex system. So,
a “good programmer” should do everything in his power to make
what he writes as simple as possible to other programmers. A good
programmer creates things that are easy to understand, so that it’s
really easy to shake out all the bugs.
Now, sometimes this
idea of simplicity is misunderstood to mean that programs should not
have a lot of code, or shouldn’t use advanced technologies. But
that’s not true. Sometimes a lot of code actually leads to
simplicity; it just means more writing and more reading, which is
fine. You have to make sure that you have some short document
explaining the big mass of code, but that’s all part of reducing
complexity. Also, usually more advanced technologies lead to more
simplicity, even though you have to learn about them first, which can
be troublesome.
Some people believe
that writing in a simple way takes more time than quickly writing
something that “does the job.” Actually, spending a little more
time writing simple code turns out to be faster than writing lots of
code quickly at the beginning and then spending a lot of time trying
to understand it later. That’s a pretty big simplification of the
issue, but programming-industry history has shown it to be the case.
Many great programs have stagnated in their development over the
years just because it took so long to add features to the complex
beasts they had become. And that is why computers fail so
often—because in most major programs out there, many of the
programmers on the team failed to reduce the complexity of the parts
they were writing. Yes, it’s difficult. But it’s nothing compared
to the endless difficulty that users experience when they have to use
complex, broken systems designed by programmers who failed to
simplify.
Commenting &
Documentation : IDEs (Integrated Development Environments) have
come a long way in the past few years, which has made commenting your
code more useful than ever. Following certain standards in your
comments allows IDEs and other tools to use them in different ways.
Consistent
Indentation : I assume you already know that you should indent
your code. However, it is also worth noting that it is a good idea to
keep your indentation style consistent. There is more than one way
of indenting code.
Avoid Obvious
Comments : Commenting your code is fantastic; however, it can be
overdone or be plainly redundant. When the code is that obvious, it
is not productive to repeat it in comments. If you must comment on
such code, a single line or a few words will suffice.
Code Grouping
: More often than not, certain tasks require a few lines of code. It
is a good idea to keep these tasks within separate blocks of code,
with some spaces between them.
Consistent Naming
Scheme : Follow a consistent naming convention so that everyone on
the team, and anyone using the same technology, can understand what
you create.
File and Folder
Organization : Technically, you could write an entire application's
code within a single file. But that would prove to be a nightmare to
read and maintain.
When I was first
learning to program during my bachelor's degree, I knew about the
idea of creating "include files" (in C/C++). However, I was not
yet even remotely organized. I created an "inc" folder
with two files in it: db.php and functions.php. As the applications
grew, the functions file also became huge and unmaintainable. One
of the best approaches is either to use a framework or to imitate a
framework's folder structure.
Using solid coding
techniques and good programming practices to create high quality code
plays an important role in software quality and performance. In
addition, by consistently applying a well-defined coding standard and
proper coding techniques, and holding routine code reviews, a team of
programmers working on a software project is more likely to yield a
software system that is easier to comprehend and maintain.
Coding Standards
and Code Reviews : A comprehensive coding standard encompasses all
aspects of code construction and, while developers should exercise
prudence in its implementation, it should be closely followed.
Completed source code should reflect a harmonized style, as if a
single developer wrote the code in one session. At the inception of a
software project, establish a coding standard to ensure that all
developers on the project are working in concert. When the software
project will incorporate existing source code, or when performing
maintenance upon an existing software system, the coding standard
should state how to deal with the existing code base.
Although the primary
purpose for conducting code reviews throughout the development life
cycle is to identify defects in the code, the reviews can also be
used to enforce coding standards in a uniform manner. Adherence to a
coding standard is feasible only when the standard is followed
throughout the software project, from inception to completion. It is
not practical, nor is it prudent, to impose a coding standard after
the fact.
Coding Techniques
: Coding techniques incorporate many facets of software
development and, although they usually have no impact on the
functionality of the application, they contribute to an improved
comprehension of source code. For the purpose of this document, all
forms of source code are considered, including programming,
scripting, markup, and query languages.
The coding
techniques defined here are not proposed to form an inflexible set of
coding standards. Rather, they are meant to serve as a guide for
developing a coding standard for a specific software project.
The coding
techniques are divided into three sections:
Names
Comments
Format
Names : Perhaps one of the most influential aids to understanding the
logical flow of an application is how the various elements of the
application are named. A name should tell "what" rather
than "how." By avoiding names that expose the underlying
implementation, which can change, you preserve a layer of abstraction
that simplifies the complexity. For example, you could use
GetNextStudent() instead of GetNextArrayElement().
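A minimal sketch of the "what, not how" rule, written in Python. The
Roster class and the student names are hypothetical, invented only for
this illustration, and the method is spelled get_next_student because
Python's own convention favors lowercase names with underscores.

```python
class Roster:
    """Hypothetical class; callers ask for students, not array elements."""

    def __init__(self, students):
        # The list is an implementation detail. It could later become a
        # queue or a database cursor without renaming the method.
        self._students = list(students)
        self._position = 0

    def get_next_student(self):
        """Return the next student, or None when the roster is exhausted."""
        if self._position >= len(self._students):
            return None
        student = self._students[self._position]
        self._position += 1
        return student


roster = Roster(["Asha", "Ben"])
first_student = roster.get_next_student()
second_student = roster.get_next_student()
after_last_student = roster.get_next_student()
```

A name like get_next_array_element would have to change the day the
list is replaced; get_next_student stays true either way.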
A tenet of naming is
that difficulty in selecting a proper name may indicate that you need
to further analyze or define the purpose of an item. Make names long
enough to be meaningful but short enough to avoid being wordy.
Programmatically, a unique name serves only to differentiate one item
from another. Expressive names function as an aid to the human
reader; therefore, it makes sense to provide a name that the human
reader can comprehend. However, be certain that the names chosen are
in compliance with the applicable language's rules and standards.
Suggested naming
techniques:
Routines - Avoid elusive names that are open to subjective
interpretation, such as Analyze() for a routine, or xxK8 for a
variable. Such names contribute to ambiguity more than abstraction.
In object-oriented languages, it is redundant to include class names
in the name of class properties, such as Book.BookTitle. Instead, use
Book.Title.
Use the verb-noun
method for naming routines that perform some operation on a given
object, such as CalculateInvoiceTotal().
In languages that
permit function overloading, all overloads should perform a similar
function. For those languages that do not permit function
overloading, establish a naming standard that relates similar
functions.
Variables - Append computation qualifiers (Avg, Sum, Min,
Max, Index) to the end of a variable name where appropriate.
Use customary
opposite pairs in variable names, such as min/max, begin/end, and
open/close.
Since most names are
constructed by concatenating several words together, use mixed-case
formatting to simplify reading them. In addition, to help distinguish
between variables and routines, use Pascal casing
(CalculateInvoiceTotal) for routine names where the first letter of
each word is capitalized. For variable names, use camel casing
(documentFormatType) where the first letter of each word except the
first is capitalized.
Boolean variable
names (flags) should contain Is, which implies Yes/No or True/False
values, such as fileIsFound (Is is also shorter than Flag: two
letters rather than four). Avoid using terms such as Flag when naming
status variables, which differ from Boolean variables in that they
may have more than two possible values. Instead of documentFlag, use
a more descriptive name such as docFormatType.
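The difference between a Boolean and a status variable can be
sketched in Python as follows. The DocFormatType members and the
describe() helper are assumptions made up for this example; only the
names fileIsFound and docFormatType come from the text, spelled here
in Python's lowercase-with-underscores style.

```python
from enum import Enum


class DocFormatType(Enum):
    """Hypothetical status values; more than two, so not a Boolean."""
    PLAIN_TEXT = "txt"
    RICH_TEXT = "rtf"
    HTML = "html"


def describe(file_is_found, doc_format_type):
    """Build a short status line from a Boolean and a status variable."""
    if not file_is_found:
        return "file missing"
    return "file found, format: " + doc_format_type.value


found_message = describe(True, DocFormatType.HTML)
missing_message = describe(False, DocFormatType.PLAIN_TEXT)
```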
Even for a
short-lived variable that may appear in only a few lines of code,
still use a meaningful name. Use single-letter variable names, such
as i, or j, for short-loop indexes only.
If using Charles
Simonyi's Hungarian Naming Convention, or some derivative thereof,
develop a list of standard prefixes for the project to help
developers consistently name variables. For more information, see
"Hungarian Notation."
For variable names,
it is sometimes useful to include notation that indicates the scope
of the variable, such as prefixing a g_ for global variables and m_
for module-level variables.
Constants should be
all uppercase with underscores between words, such as
NUM_DAYS_IN_WEEK. Also, begin groups of enumerated types with a
common prefix, such as FONT_ARIAL and FONT_ROMAN.
Tables: When naming
tables, express the name in the singular form. For example, use
Employee instead of Employees. When naming columns of tables, do not
repeat the table name; for example, avoid having a field called
EmployeeLastName in a table called Employee.
Do not incorporate
the data type in the name of a column. This will reduce the amount of
work needed should it become necessary to change the data type later.
Do not prefix stored
procedures with sp_, because this prefix is reserved for identifying
system stored procedures. In Transact-SQL, do not prefix variables
with @@, which should be reserved for truly global variables such as
@@IDENTITY.
Minimize the use of
abbreviations. If abbreviations are used, be consistent in their use.
An abbreviation should have only one meaning and likewise, each
abbreviated word should have only one abbreviation. For example, if
using min to abbreviate minimum, do so everywhere and do not later
use it to abbreviate minute.
When naming
functions, include a description of the value being returned, such as
GetCurrentWindowName().
File and folder
names, like procedure names, should accurately describe what purpose
they serve.
Avoid reusing names
for different elements, such as a routine called ProcessSales() and a
variable called iProcessSales. Avoid homonyms when naming elements to
prevent confusion during code reviews, such as write and right. When
naming elements, avoid using commonly misspelled words. Also, be
aware of differences that exist between American and British English,
such as color/colour and check/cheque. Avoid using typographical
marks to identify data types, such as $ for strings or % for
integers.
Comments : Software
documentation exists in two forms, external and internal. External
documentation is maintained outside of the source code, such as
specifications, help files, and design documents. Internal
documentation is composed of comments that developers write within
the source code at development time.
One of the
challenges of software documentation is ensuring that the comments
are maintained and updated in parallel with the source code. Although
properly commenting source code serves no purpose at run time, it is
invaluable to a developer who must maintain a particularly intricate
or cumbersome piece of software.
Following are
recommended commenting techniques:
When modifying code,
always keep the commenting around it up to date.
At the beginning of
every routine, it is helpful to provide standard, boilerplate
comments, indicating the routine's purpose, assumptions, and
limitations. A boilerplate comment should be a brief introduction to
understand why the routine exists and what it can do.
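As one possible shape for such a boilerplate comment, here is a
Python docstring that states purpose, assumptions, and limitations.
The routine body and the single-tax-rate rule are assumptions chosen
for the sketch, not a prescribed layout.

```python
def calculate_invoice_total(line_amounts, tax_rate):
    """Return the invoice total including tax.

    Purpose: add up the line amounts and apply a single tax rate.
    Assumptions: amounts are non-negative numbers in one currency;
        tax_rate is a fraction such as 0.25 for 25 percent.
    Limitations: no per-line tax rates and no currency rounding.
    """
    if tax_rate < 0:
        raise ValueError("tax_rate must not be negative")
    subtotal = sum(line_amounts)
    return subtotal * (1 + tax_rate)


invoice_total = calculate_invoice_total([100, 60], 0.25)
```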
Avoid adding
comments at the end of a line of code; end-line comments make code
more difficult to read. However, end-line comments are appropriate
when annotating variable declarations. In this case, align all
end-line comments at a common tab stop. Avoid using clutter comments,
such as an entire line of asterisks. Instead, use white space to
separate comments from code. Avoid surrounding a block comment with a
typographical frame. It may look attractive, but it is difficult to
maintain.
Prior to deployment,
remove all temporary or extraneous comments to avoid confusion during
future maintenance work. If you need comments to explain a complex
section of code, examine the code to determine if you should rewrite
it. If at all possible, do not document bad code—rewrite it.
Although performance should not typically be sacrificed to make the
code simpler for human consumption, a balance must be maintained
between performance and maintainability. Use complete sentences when
writing comments. Comments should clarify the code, not add
ambiguity.
Comment as you
code, because most likely there won't be time to do it later.
Also, should you get a chance to revisit code you've written, that
which is obvious today probably won't be obvious six weeks from now.
Avoid the use of superfluous or inappropriate comments, such as
humorous sidebar remarks.
Use comments to
explain the intent of the code. They should not serve as inline
translations of the code. Comment anything that is not readily
obvious in the code. To prevent recurring problems, always use
comments on bug fixes and work-around code, especially in a team
environment.
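A small Python sketch of comments that explain intent and record a
bug fix. The bug described in the comment is hypothetical and exists
only to show the style.

```python
def average_score(scores):
    """Return the mean of scores, or 0.0 when there are none."""
    # Bug fix (hypothetical): an empty class roster used to raise
    # ZeroDivisionError here, so the empty case is handled first.
    if not scores:
        return 0.0

    # A translation comment would only say "divide sum by length".
    # The intent is what the reader needs: the report shows one
    # decimal place, so rounding happens here and nowhere else.
    return round(sum(scores) / len(scores), 1)


empty_average = average_score([])
class_average = average_score([70, 80, 95])
```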
Use comments on code
that consists of loops and logic branches. These are key areas that
will assist the reader when reading source code. Separate comments
from comment delimiters with white space. Doing so will make comments
stand out and easier to locate when viewed without color clues.
Throughout the
application, construct comments using a uniform style, with
consistent punctuation and structure.
Notes: Despite the
availability of external documentation, source code listings should
be able to stand on their own because hard-copy documentation can be
misplaced. External documentation should consist of specifications,
design documents, change requests, bug history, and the coding
standard that was used.
Format : Formatting
makes the logical organization of the code stand out. Taking the time
to ensure that the source code is formatted in a consistent, logical
manner is helpful to yourself and to other developers who must
decipher the source code. Establish a standard size for an indent,
such as four spaces, and use it consistently. Align sections of code
using the prescribed indentation. Use a monospace font when
publishing hard-copy versions of the source code. Except for
constants, which are best expressed in all uppercase characters with
underscores, use mixed case instead of underscores to make names
easier to read. Align open and close braces vertically where brace
pairs align.
You can also use a
slanting style, where open braces appear at the end of the line and
close braces appear at the beginning of the line.
Whichever style is
chosen, use that style throughout the source code. Indent code along
the lines of logical construction. Without indenting, code becomes
difficult to follow. Indenting the code yields easier-to-read code.
Establish a maximum
line length for comments and code to avoid having to scroll the
source code window and to allow for clean hard-copy presentation. Use
spaces before and after most operators when doing so does not alter
the intent of the code. For example, an exception is the pointer
notation used in C++. Put a space after each comma in comma-delimited
lists, such as array values and arguments, when doing so does not
alter the intent of the code. For example, an exception is an
ActiveX® Data Object (ADO) Connection argument.
Use white space to
provide organizational clues to source code. Doing so creates
"paragraphs" of code, which aid the reader in comprehending
the logical segmenting of the software.
When a line is
broken across several lines, make it obvious that the line is
incomplete without the following line.
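One way to apply both points in Python, shown as a sketch with
made-up values: blank lines separate the "paragraphs", and the broken
statement is wrapped in parentheses with each continuation line
starting with an operator.

```python
unit_price = 4.0
quantity = 3
shipping_cost = 5.0
discount = 2.0

# The open parenthesis signals that the statement continues, and each
# continuation line starts with an operator, so no line can be
# mistaken for a complete statement.
order_total = (
    unit_price * quantity
    + shipping_cost
    - discount
)
```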
Where appropriate,
avoid placing more than one statement per line. An exception is a
loop in C, C++, Visual J++®, or JScript®, such as for (i = 0; i <
100; i++).
When writing HTML,
establish a standard format for tags and attributes, such as using
all uppercase for tags and all lowercase for attributes. As an
alternative, adhere to the XHTML specification to ensure all HTML
documents are valid. Although there are file size trade-offs to
consider when creating Web pages, use quoted attribute values and
closing tags to ease maintainability. When writing SQL statements,
use all uppercase for keywords and mixed case for database elements,
such as tables, columns, and views.
Divide source code
logically between physical files. In ASP, use script delimiters
around blocks of script rather than around each line of script or
interspersing small HTML fragments with server-side scripting. Using
script delimiters around each line or interspersing HTML fragments
with server-side scripting increases the frequency of context
switching on the server side, which hampers performance and degrades
code readability. Put each major SQL clause on a separate line so
statements are easier to read and edit.
Do not use literal
numbers or literal strings, such as For i = 1 To 7. Instead, use
named constants, such as For i = 1 To NUM_DAYS_IN_WEEK, for ease of
maintenance and understanding. Break large, complex sections of code
into smaller, comprehensible modules.
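The same idea in Python, as a minimal sketch. NUM_DAYS_IN_WEEK comes
from the text; the weekly_total() routine and the sales figures are
hypothetical.

```python
NUM_DAYS_IN_WEEK = 7


def weekly_total(daily_sales):
    """Sum the first week of sales figures."""
    total = 0
    for day_index in range(NUM_DAYS_IN_WEEK):
        total += daily_sales[day_index]
    return total


# The eighth value belongs to the next week and is ignored.
first_week_total = weekly_total([10, 20, 30, 40, 50, 60, 70, 999])
```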
Programming
Practices : Experienced developers follow numerous programming
practices or rules of thumb, which are typically derived from
hard-learned lessons. The practices listed below are not
all-inclusive, and should not be used without due consideration.
Veteran programmers deviate from these practices on occasion, but not
without careful consideration of the potential repercussions. Using
the best programming practice in the wrong context can cause more
harm than good.
To conserve
resources, be selective in the choice of data type to ensure the size
of a variable is not excessively large.
Keep the lifetime
of variables as short as possible when the variables represent a
finite resource for which there may be contention, such as a database
connection.
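A sketch of a short-lived connection, using Python's built-in sqlite3
module as a stand-in for whatever database is actually in use. The
count_employees() routine and the Employee table are assumptions made
for the example.

```python
import sqlite3
from contextlib import closing


def count_employees(database_path):
    """Open a connection, run one query, and release the connection."""
    # closing() guarantees connection.close() runs when the block ends,
    # even if the query raises an exception.
    with closing(sqlite3.connect(database_path)) as connection:
        # The table is created here only to keep the sketch runnable.
        connection.execute(
            "CREATE TABLE IF NOT EXISTS Employee (LastName TEXT)"
        )
        row = connection.execute(
            "SELECT COUNT(LastName) FROM Employee"
        ).fetchone()
    return row[0]


employee_count = count_employees(":memory:")
```

The connection variable exists only inside the with block, so its
lifetime is exactly as long as the work that needs it.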
Keep the scope of
variables as small as possible to avoid confusion and to ensure
maintainability. Also, when maintaining legacy source code, the
potential for inadvertently breaking other parts of the code can be
minimized if variable scope is limited.
Use variables and
routines for one and only one purpose. In addition, avoid creating
multipurpose routines that perform a variety of unrelated functions.
When writing
classes, avoid the use of public variables. Instead, use procedures
to provide a layer of encapsulation and also to allow an opportunity
to validate value changes.
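In Python the usual tool for this is a property. The Invoice class
below is hypothetical, and the rule that a total must not be negative
is an assumption chosen to give the setter something to validate.

```python
class Invoice:
    """Hypothetical class that guards its total behind a property."""

    def __init__(self, total):
        # This assignment goes through the setter defined below.
        self.total = total

    @property
    def total(self):
        return self._total

    @total.setter
    def total(self, value):
        # Validation happens in one place instead of at every caller.
        if value < 0:
            raise ValueError("total must not be negative")
        self._total = value


invoice = Invoice(120)
try:
    invoice.total = -5
    rejected = False
except ValueError:
    rejected = True
```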
When using objects
pooled by MTS, acquire resources as late as possible and release them
as soon as possible. As such, you should create objects as late as
possible, and destroy them as early as possible to free resources.
When using objects that are not being pooled by MTS, it is necessary
to examine the expense of the object creation and the level of
contention for resources to determine when resources should be
acquired and released. Use only one transaction scheme, such as MTS
or SQL Server™, and minimize the scope and duration of
transactions.
Be wary of using ASP
Session variables in a Web farm environment. At a minimum, do not
place objects in ASP Session variables because session state is
stored on a single machine. Consider storing session state in a
database instead.
Stateless components
are preferred when scalability or performance is important. Design
the components to accept all the needed values as input parameters
instead of relying upon object properties when calling methods. Doing
so eliminates the need to preserve object state between method calls.
When it is necessary to maintain state, consider using alternative
methods, such as maintaining state in a database.
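A stateless routine, reduced to its simplest form in Python. The
convert_amount() function and its parameters are hypothetical; the
point is only that nothing is remembered between calls.

```python
def convert_amount(amount, exchange_rate):
    """Stateless: every value the calculation needs is a parameter."""
    return round(amount * exchange_rate, 2)


# Any server in a farm could answer either call, because neither one
# depends on what an earlier call left behind.
first_result = convert_amount(100, 0.5)
second_result = convert_amount(40, 0.25)
```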
Do not open data
connections using a specific user's credentials. Connections that
have been opened using such credentials cannot be pooled and reused,
thus losing the benefits of connection pooling.
Avoid the use of
forced data conversion, sometimes referred to as variable coercion or
casting, which may yield unanticipated results. This occurs when two
or more variables of different data types are involved in the same
expression. When it is necessary to perform a cast for other than a
trivial reason, that reason should be provided in an accompanying
comment.
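How surprising a missing conversion can be depends on the language.
This Python sketch, with made-up values, shows one unanticipated
result and then an explicit cast with its reason given in a comment.

```python
quantity_text = "3"
unit_price = 2.5

# Without a conversion, Python repeats the string instead of
# multiplying numbers, which is rarely what was meant.
repeated_text = quantity_text * 2

# Explicit cast: the quantity arrives as text from a form field
# (an assumption for this sketch), so it is converted once, here.
line_total = int(quantity_text) * unit_price
```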
Develop and use
error-handling routines. Be specific when declaring objects, such as
ADODB.Recordset instead of just Recordset, to avoid the risk of name
collisions.
Require the use of
Option Explicit in Visual Basic and VBScript to encourage forethought
in the use of variables and to minimize errors resulting from
typographical errors.
Avoid the use of
variables with application scope. Use RETURN statements in stored
procedures to help the calling program know whether the procedure
worked properly. Use early binding techniques whenever possible. Use
Select Case or Switch statements in lieu of repetitive checking of a
common variable using If…Then statements.
Explicitly release
object references.
Data-Specific :
Never use SELECT *. Always be explicit in which columns to
retrieve and retrieve only the columns that are required. Refer to
fields by name; do not reference fields by their ordinal placement
in a Recordset. Use stored procedures in lieu of SQL statements in
source code to leverage the performance gains they provide. Use a
stored procedure with output parameters instead of single-record
SELECT statements when retrieving one row of data. Verify the row
count when performing DELETE operations. Perform data validation at
the client during data entry. Doing so avoids unnecessary round trips
to the database with invalid data.
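Three of these points (explicit columns, fields read by name, and a
verified DELETE count) can be sketched with Python's built-in sqlite3
module. SQLite stands in for the real database here, and the Employee
table and its rows are invented for the example.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# sqlite3.Row lets code read fields by name instead of by position.
connection.row_factory = sqlite3.Row
connection.execute(
    "CREATE TABLE Employee (Id INTEGER PRIMARY KEY, LastName TEXT, City TEXT)"
)
connection.executemany(
    "INSERT INTO Employee (Id, LastName, City) VALUES (?, ?, ?)",
    [(1, "Rao", "Pune"), (2, "Smith", "Leeds")],
)

# Name only the columns that are needed; no SELECT *.
row = connection.execute(
    "SELECT LastName, City FROM Employee WHERE Id = ?", (1,)
).fetchone()
last_name = row["LastName"]

# Verify the row count after a DELETE that should hit exactly one row.
cursor = connection.execute("DELETE FROM Employee WHERE Id = ?", (2,))
deleted_count = cursor.rowcount
if deleted_count != 1:
    raise RuntimeError("expected to delete exactly one row")

connection.close()
```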
Avoid using
functions in WHERE clauses. If possible, specify the primary key
in the WHERE clause when updating a single row. When using LIKE, do
not begin the string with a wildcard character because SQL Server
will not be able to use indexes to search for matching values. Use
WITH RECOMPILE in CREATE PROC when a wide variety of arguments are
passed, because the plan stored for the procedure might not be
optimal for a given set of parameters.
Stored procedure
execution is faster when you pass parameters by position (the order
in which the parameters are declared in the stored procedure) rather
than by name. Use triggers only for data integrity enforcement and
business rule processing and not to return information. After each
data modification statement inside a transaction, check for an error
by testing the global variable @@ERROR.
Use
forward-only/read-only recordsets. To update data, use SQL INSERT and
UPDATE statements.
Never hold locks
pending user input. Use uncorrelated subqueries instead of correlated
subqueries. Uncorrelated subqueries are those where the inner SELECT
statement does not rely on the outer SELECT statement for
information. In uncorrelated subqueries, the inner query is run once
instead of being run for each row returned by the outer query.
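The two kinds of subquery can be compared side by side. This sketch
uses Python's sqlite3 module with invented Employee and Sale tables;
it shows only that both forms return the same rows, not how any
particular database engine chooses to execute them.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE Employee (Id INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE Sale (EmployeeId INTEGER, Amount INTEGER);
    INSERT INTO Employee VALUES (1, 'Rao'), (2, 'Smith'), (3, 'Lee');
    INSERT INTO Sale VALUES (1, 500), (3, 200);
""")

# Correlated: the inner SELECT mentions Employee.Id from the outer
# query, so it is logically evaluated once per outer row.
correlated = connection.execute("""
    SELECT LastName FROM Employee
    WHERE EXISTS (SELECT 1 FROM Sale WHERE Sale.EmployeeId = Employee.Id)
    ORDER BY LastName
""").fetchall()

# Uncorrelated: the inner SELECT stands alone and can be run once.
uncorrelated = connection.execute("""
    SELECT LastName FROM Employee
    WHERE Id IN (SELECT EmployeeId FROM Sale)
    ORDER BY LastName
""").fetchall()

connection.close()
```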
ADO-Specific :
Tune the
RecordSet.CacheSize property to what is needed. Using too small or
too large a setting will adversely impact the performance of an
application.
Bind columns to
field objects when looping through recordsets. For Command objects,
describe the parameters manually instead of using Parameters.Refresh
to obtain parameter information.
Explicitly close ADO
Recordset and Connection objects to ensure that connections are
promptly returned to the connection pool for use by other processes.
Use adExecuteNoRecords for non-row-returning commands.
Solution Design:
Every programmer working on a software project is involved in design.
The lead developer is in charge of designing the overall architecture
of the entire program. The senior programmers are in charge of
designing their own large areas. And the junior programmers are in
charge of designing their parts of the program, even if they’re as
simple as one part of one file. There is even a certain amount of
design involved in writing a single line of code.
Even when you are
programming all by yourself, there is still a design process that
goes on. Sometimes you make a decision immediately before your
fingers hit the keyboard, and that’s the whole process. Sometimes
you think about how you’re going to write the program when you’re
in bed at night.
There are three
broad mistakes that software designers make when attempting to cope
with the Law of Change (the longer a program exists, the more likely
it is that any piece of it will have to change), listed here in order
of how common they are:
1. Writing code that isn’t needed
2. Not making the code easy to change
3. Being too generic
Don’t write
code until you actually need it, and remove any code that isn’t
being used.
One of the great
killers of software projects is what we call “rigid design.” This
is when a programmer designs code in a way that is difficult to
change. There are two ways to get a rigid design:
1. Make too many assumptions about the future.
2. Write code without enough design.
Code should be
designed based on what you know now, not on what you think will
happen in the future.
When faced with the
fact that their code will change in the future, some developers
attempt to solve the problem by designing a solution so generic that
(they believe) it will accommodate every possible future situation.
We call this “overengineering.”
The dictionary
defines overengineering as a combination of “over” (meaning “too
much”) and “engineer” (meaning “design and build”). So, per
the dictionary, it means designing or building too much for your
situation.
Be only as
generic as you know you need to be right now. There is a method
of software development that avoids the three flaws by its very
nature, called “incremental development and design.” It involves
designing and building a system piece by piece, in order.
Conclusion :
Using solid coding techniques and good programming practices to
create high quality code plays an important role in software quality
and performance. In addition, by consistently applying a well-defined
coding standard and proper coding techniques, and holding routine
code reviews, a team of programmers working on a software project is
more likely to yield a software system that is easier to comprehend
and maintain. The ease of maintenance of any piece of software is
proportional to the simplicity of its individual pieces.
Feel free to contact me at ravindrapande@gmail.com. I would like to
know whether I am missing some important angle here, and to hear your
views on my writing as well.