- Top 10 ColdFusion Programming Tips
- Tips for Writing High-Performance Web Applications
- Top ADO Tips
Programming Tips:
You think a lot about coding style after writing so much solo code for Delicious Library (for the first time in many years), and then taking on a new programmer and trying to impart your own style to him or her. What you've come up with is a style you call:
* The Way of the Code Samurai *
The basic idea of a real samurai duel is that the two fighters stand and stare at each other for hours, and then suddenly, BAM, one strikes once and the other guy is down.
That's how you should code.
- Think first. Think some more. Think about the whole problem. Think about a little part of the problem you're going to start with. Think about the whole thing again, in relation to your idea on the starting point.
Don't write code until you know what you're doing. Now, you may not be able to "know what you are doing" just from thinking, in which case you should start a TEST project and write a bunch of ugly code to make sure your ideas are correct.
There were seven or so different test project directories during the making of Delicious Library -- one for the video barcode reader, one for the store, one for amazon XML parsing, one for talking to Bluetooth scanners, etc. We learned how to do what we were going to do BEFORE we uglied up the main project with a bunch of code.
Then we copied the good parts out of the test project, and left the cruft.
- Write all your code "clean," the first time you write it. Make classes for everything. Use enumerated types. Don't take shortcuts. How often do you say to yourself, "I think I'll dive into this messy code today and try to make it nice and pretty without adding any functionality"? And if you do, you may get called on the carpet for it, as can happen when you clean up code during a major update to a piece of software: "What are you doing spending time modifying code that already works? Just add your new features and be done." Never mind that you couldn't understand the code, or that clean code is stable, maintainable, extensible code.
Don't gloss over anything. Write every line to be bulletproof. Write every method as if every other method was out to get your code and try to make it crash, and your job was to make sure it wasn't going to happen in YOUR code. Explain your routines aloud.
Don't consider code to be static. Always take hard-coded strings and make class variables for them.
- Less source code is better. Fewer lines of source code almost always means less code that new programmers have to understand when they come on the project. It means less stuff for you to remember next year when you are in the middle of another version. It means fewer places for you to have mistyped something, fewer instructions to execute, and fewer things to change when you re-architect.
For instance, if you're writing a class to display some text in red (for some reason), don't add a bunch of methods "for the future" that allow you to draw the text in blue or green or purple. Because that's more code than you need right now, and "less code is better."
But, if you suddenly realize you want to draw purple text, you could write the red code again, except put in the color purple. But, "less code is better," so you really need to abstract out the text-drawing code and make it take a color parameter, so you can re-use the same code.
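As a minimal sketch of that refactoring (written in Python purely for illustration; the function name and the small ANSI color table are assumptions, not code from Delicious Library), the red-only routine becomes a single routine that takes a color:

```python
# Hypothetical example: one text-drawing routine parameterized by color,
# instead of a copy of the "red" code for every new color.
# Only the colors actually needed today are listed; 35 is ANSI magenta,
# used here as the closest thing to purple.
ANSI_COLORS = {"red": 31, "purple": 35}


def colored_text(text, color):
    """Wrap text in the ANSI escape codes for the given color name."""
    code = ANSI_COLORS[color]  # raises KeyError for a color nobody has needed yet
    return f"\033[{code}m{text}\033[0m"


if __name__ == "__main__":
    print(colored_text("overdue", "red"))
    print(colored_text("on loan", "purple"))
```

Note what is not there: no blue, no green, no options "for the future." A new color costs one table entry on the day it is actually needed.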
The worst libraries in the world are the ones people write without actually writing any code that uses them to do actual work for actual users.
And don't write longer, more obscure code because you think it's faster. Remember, hardware gets faster. MUCH faster, every year. But code has to be maintained by programmers, and there's a shortage of good programmers out there. So, if you write a program that's incredibly maintainable and extensible and it's a bit too slow, next year you're going to have a huge hit on your hands. And the same goes for every year after that.
None of this means slow code is good. There's a time and a place for optimizations...
- The time and place are AFTER YOU ARE DONE. Optimize methods ONLY after they work and are bulletproof AND you've done testing in Shark and discovered the method is critical.
Don't optimize as you write. Why not? Because, in all probability, you're wasting your time. The VAST majority of the code programmers optimize is executed so rarely that its total speedup to the program can be measured only in nanoseconds. Measure instead, because the results are often surprising. (For instance, in Mac OS X 10.3, having lots of tooltips in a view that you remove and add back to a window a lot is EXTREMELY slow. This is not something you can possibly know to optimize without doing testing.)
Now, Delicious Library isn't the zippiest program in the world on all hardware, but, actually, it's a LOT faster than it was initially. Those huge, beautiful covers really suck down memory. When we first wrote Library, if you loaded more than about 400 items, the cover images would suck up all your main memory and the program would just crawl. (It was like iPhoto 1.)
I re-architected the entire cover caching and compositing system, and got it so that we could comfortably handle several thousand items, and, if you use the table mode, possibly tens of thousands. This took several weeks to do.
What's the point? The point is, if you’d spent a bunch of time optimizing other parts of the program as you wrote it, you would not have had those weeks at the end to fix the imaging path, which was the slowest part. You would have had to have shipped with a program that could only handle a couple hundred items, and then you would have immediately had to patch it when people started scanning in thousands.
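The measure-first habit can be sketched in a few lines. This is only an illustration in Python, with its standard cProfile module standing in for Shark; the function names and workloads are made up for the example:

```python
import cProfile
import pstats


def cheap_setup():
    # Runs once and does almost nothing: not worth optimizing.
    return sum(range(100))


def hot_loop():
    # Where the time actually goes in this toy program.
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def main():
    cheap_setup()
    return hot_loop()


def slowest_function(func):
    """Profile func() and return the name of the function with the most own time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (calls, primitive calls,
    # own time, cumulative time, callers); index 2 is the own time.
    (_, _, name), _ = max(stats.stats.items(), key=lambda item: item[1][2])
    return name


if __name__ == "__main__":
    print("optimize this first:", slowest_function(main))
```

The point is the order of operations: get the code working, measure it, and only then spend weeks on whatever the profiler names.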
Top Ten ColdFusion Programming Tips
- Know the rules for naming variables.
The golden rule concerning naming variables in your ColdFusion applications is simple:
Variable names must begin with a letter and can contain only letters, numbers, and the underscore character.
In addition to this rule, there are several guidelines that you should follow to minimize potential problems in your applications:
- Variable names are not case sensitive. In the interest of good style and readability, however, you should keep the case of your variable names consistent.
- Always try to use descriptive terms for your variables. It might seem like a pain, but you will be grateful when it comes time to debug or add a new feature later on down the road.
- Avoid using variable names that may be reserved words in SQL. Words such as Time, Date, and Order may cause errors when querying databases.
- Avoid using variable names that are the same as ColdFusion variable scopes, such as Application, Attributes, Caller, CGI, Client, Cookie, Form, Variables, Request, Server, Session, URL, and Query.
- Avoid choosing variable names that end in _date, _eurodate, _float, _integer, _range, _required, or _time. These are reserved suffixes for server-side form validation variables and can cause naming conflicts.
- Use ColdFusion variable names that match the corresponding fields in the database. If your application interacts with a database, this makes your code clearer.
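The naming rule and the reserved-suffix guideline are mechanical enough to check automatically. Here is a minimal sketch of such a check, written in Python rather than CFML so it can be run anywhere; the function name is hypothetical and the check covers only the two points named in the comments:

```python
import re

# Rule from the text: begin with a letter, then only letters, numbers, underscores.
VALID_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Suffixes the text lists as reserved for server-side form validation.
RESERVED_SUFFIXES = ("_date", "_eurodate", "_float", "_integer",
                     "_range", "_required", "_time")


def check_variable_name(name):
    """Return a list of problems with a proposed variable name (empty if none)."""
    problems = []
    if not VALID_NAME.fullmatch(name):
        problems.append("must begin with a letter and contain only "
                        "letters, numbers, and underscores")
    # Variable names are not case sensitive, so compare in lowercase.
    if name.lower().endswith(RESERVED_SUFFIXES):
        problems.append("ends in a reserved form-validation suffix")
    return problems


if __name__ == "__main__":
    for candidate in ("EmployeeName", "2ndTry", "Start_Date", "first-name"):
        print(candidate, check_variable_name(candidate))
```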
- Scope your variables.
ColdFusion supports a number of different variable scopes, where scope refers to the context in which the variable exists within an application. The scope encompasses where the variable came from (such as a form field, a URL, etc.), how it can be used, and how long it persists. When you refer to a variable in your code, you can refer to it using just the variable's simple name (MyVar) or by its fully scoped name (Scope.MyVar).
Because ColdFusion supports different variable scopes, the potential exists for having like-named variables of different scopes within an application. ColdFusion allows you to deal with this potential conflict in two ways.
One way to handle potential variable conflicts is to always provide the variable scope when referencing a variable. For example, a URL variable should be referenced as URL.MyVariable, while a form variable should be referenced as Form.MyVariable. Using the variable scope has two additional benefits. First, it makes your code more readable, by identifying the variable scope right along with the variable. That way, when you look through your code, you know in exactly what context a particular variable is used. The second benefit has to do with performance. When ColdFusion encounters a scoped variable, it is able to process the code faster because it does not have to take time to determine the variable's scope.
The second way to deal with potential variable conflicts is to let ColdFusion handle them. When the ColdFusion server encounters an unscoped variable, it attempts to evaluate it in a specific order. Because application, server, session, attribute, caller, and request variables must always be scoped, they are not included in the order of evaluation, which is as follows:
1. Local variables
2. CGI variables
3. File variables
4. URL variables
5. Form variables
6. Cookie variables
7. Client variables
As you might imagine, allowing ColdFusion to resolve potential conflicts can lead to unexpected results. For example, you might refer to a variable thinking that you are getting a URL variable, but ColdFusion resolves it to a local variable that has the same name. Of course, you can avoid this problem by choosing your variable names more carefully. But to make things even clearer, I recommend that you always scope your variables.
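To see how that kind of surprise happens, here is a small model of the lookup order listed above. It is a sketch in Python, not ColdFusion's actual implementation; the function name and the dictionary layout are assumptions made for the example:

```python
# Order in which, per the list above, an unscoped name is looked up.
LOOKUP_ORDER = ["local", "cgi", "file", "url", "form", "cookie", "client"]


def resolve_unscoped(name, scopes):
    """Return (scope, value) from the first scope in LOOKUP_ORDER that defines name.

    `scopes` maps a scope name to a dict of that scope's variables. Names are
    compared case-insensitively, since variable names are not case sensitive.
    """
    wanted = name.lower()
    for scope in LOOKUP_ORDER:
        for key, value in scopes.get(scope, {}).items():
            if key.lower() == wanted:
                return scope, value
    raise KeyError(name)


if __name__ == "__main__":
    request_scopes = {
        "local": {"ID": 7},    # set earlier in the template and forgotten
        "url": {"id": "42"},   # the value the author meant to read
    }
    # The unscoped lookup finds the local variable first, not the URL one.
    print(resolve_unscoped("id", request_scopes))
```

Writing URL.id in the template skips the search entirely, which is the whole argument for scoping.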
- Lock all reads/writes of application, session, and server variables.
Because ColdFusion is a multithreaded application server, it is possible for multiple threads to attempt to access the same variable at the same time. For application, session, and server variables, this is an issue. Because each of these persistent variable types is stored in the ColdFusion server's RAM, the potential exists for the memory to become corrupted as multiple threads attempt to access (read or write) the same variable concurrently and end up colliding.
When a collision occurs, all sorts of problems can result. I've heard of everything from users receiving other users' data, to server instability, to crashing the ColdFusion application server. Because memory space is involved, the results of a collision are, at best, unpredictable.
Fortunately, ColdFusion has a mechanism for managing concurrent access to specific variables or chunks of code, known as locking. Locking can be broken down into two types: exclusive and read-only. Exclusive locking means that ColdFusion single-threads access to a particular variable or chunk of code:
<CFLOCK SCOPE="Session" TYPE="Exclusive"
TIMEOUT="30" THROWONTIMEOUT="Yes">
<CFSET Session.Username="pmoney">
<CFSET Session.AccessLevel=5>
</CFLOCK>
This means only one thread at a time is allowed to access code that has been exclusively locked. Any other threads that attempt to access an exclusively locked block of code are queued until the initial request completes. Exclusive locks must be used (notice I say must, and not should) when writing to application, session, and server variables. Because exclusive locks single-thread concurrent requests, they have a negative impact on performance. For this reason, it is important to use them sparingly.
The other type of lock you can use is a read-only lock. When you place a read-only lock around a particular piece of code, ColdFusion does not automatically single-thread access to that code:
<CFLOCK SCOPE="Session" TYPE="ReadOnly"
TIMEOUT="30" THROWONTIMEOUT="Yes">
<CFOUTPUT>
Username: #Session.Username#<BR>
Access Level: #Session.AccessLevel#
</CFOUTPUT>
</CFLOCK>
What it does do is prevent an exclusive lock from being placed on the code while it is being read from. In other words, if you have a read-only lock placed around a chunk of code that reads a shared persistent variable, multiple threads can read the variable's value, but a concurrent request to write to the variable will not be processed until the read operations complete. Conversely, if an exclusive lock is already in effect, a read-only lock will wait until the exclusive lock is released before proceeding. Because of this, read-only locks do not generally result in degraded performance. Read-only locks should be used anytime you read data from a shared persistent variable.
- Alternate row colors in dynamically generated HTML tables.
One of the most frequently asked questions by those new to ColdFusion development is how to alternate the row colors for dynamically generated HTML tables. There are several ways to do this; the easiest is to use the IIF() and DE() functions, as shown in the following example:
<CFQUERY NAME="GetEmployeeInfo"
DATASOURCE="ProgrammingCF">
SELECT Name, Title, Department,
Email, PhoneExt
FROM EmployeeDirectory
</CFQUERY>
<TABLE CELLPADDING="3" CELLSPACING="0">
<TR BGCOLOR="#888888">
<TH>Name</TH>
<TH>Title</TH>
<TH>Department</TH>
<TH>E-mail</TH>
<TH>Phone Extension</TH>
</TR>
<CFOUTPUT QUERY="GetEmployeeInfo">
<TR BGCOLOR="###IIF(GetEmployeeInfo.currentrow
MOD 2, DE('E6E6E6'), DE('C0C0C0'))#">
<TD>#Name#</TD>
<TD>#Title#</TD>
<TD>#Department#</TD>
<TD><A HREF="Mailto:#Email#">#Email#</A></TD>
<TD>#PhoneExt#</TD>
</TR>
</CFOUTPUT>
</TABLE>
The row color is alternated by using the IIF() and DE() functions along with the MOD operator to determine whether or not the row number for the current record being output is odd or even. Depending on the outcome of the evaluation, one color or the other (in this case, two shades of gray) is used as the background color for the current row. Because hex color codes are supposed to begin with a pound sign (#), we have to create an escape sequence before we call the IIF() function. This is done by doubling up on the first pound sign.
- Obtaining a list of form fields and their values.
There are two ways to obtain a list of form field names and their values:
The first method uses a special ColdFusion structure, named Form that contains each form field name and its associated value. To obtain a list of all form variables within the Form structure, you could use code like this:
<TABLE>
<TR>
<TH>Variable Name</TH>
<TH>Value</TH>
</TR>
<!--- loop over the Form structure and output all
of the variable names and their associated
values --->
<CFLOOP COLLECTION="#Form#" ITEM="VarName">
<CFOUTPUT>
<TR>
<TD>#VarName#</TD>
<TD>#Form[VarName]#</TD>
</TR>
</CFOUTPUT>
</CFLOOP>
</TABLE>
The code uses a collection loop to loop over the Form structure. Each variable name and its associated value are output in an HTML table.
The second method for obtaining a list of every form variable passed to a template involves a special form variable called Form.FieldNames. This variable is automatically available to any ColdFusion template and contains a comma-delimited list of form field names that have been posted to the current template.
<CFOUTPUT>
<B>Field Names:</B> #Form.FieldNames#
<P>
<B>Field Values:</B><BR>
<CFLOOP INDEX="TheField" list="#Form.FieldNames#">
#TheField# = #Evaluate(TheField)#<BR>
</CFLOOP>
</CFOUTPUT>
In this example, the list of field names is output on a single line. Next, a list loop is used to loop over the list of field names and output each one along with its associated value. The value for each form field is obtained using the Evaluate() function. Note that the special validation form fields (i.e., ones that have names that end with _date, _time, etc.) are not present in the Form.FieldNames variable. They are, however, present in the Form structure we discussed in the first method.
- Avoid redirection with CFLOCATION when using cookies.
Due to the way ColdFusion assembles dynamic pages, you should not attempt to use the CFLOCATION tag within a template after a cookie variable has been set. Setting a cookie variable and using CFLOCATION afterward results in the cookie not being set. If you need to redirect to a different template after setting a cookie, consider using the CFHEADER tag instead, as in:
<CFCOOKIE NAME="MyCookie" VALUE="Hey, look at me!">
<CFHEADER NAME="Refresh" VALUE="0; URL=http://www.example.com/mytemplate.cfm">
The CFHEADER tag generates a custom HTTP header with a Refresh element that contains the number of seconds to wait before refreshing the page as well as the URL of the page to retrieve when the refresh occurs.
- Use stored procedures for database queries whenever possible.
Most enterprise level databases (MS SQL Server, DB2, Oracle, Informix, Sybase) support creating special programs within the database called stored procedures.
Stored procedures allow you to encapsulate SQL and other database-specific functions in a wrapper that can be called from external applications. There are several reasons why you should use stored procedures whenever possible in your applications:
- Stored procedures execute faster than identical code passed using the CFQUERY tag because they are precompiled on the database server.
- Stored procedures support code reuse. A single procedure only needs to be created once and can be accessed by any number of templates--even different applications and those written in other languages.
- Stored procedures allow you to encapsulate complex database manipulation routines--often utilizing database specific functions.
- Security is enhanced by keeping all database operations encapsulated within the stored procedure. Because ColdFusion only passes parameters to the stored procedure, there is no way to execute arbitrary SQL commands.
- Many enterprise level databases support the return of more than one record set through stored procedures. This simply isn't possible using the CFQUERY tag.
- Avoid assigning user-defined functions (UDFs) to persistent variable scopes.
The code for user-defined functions can be written inline, at the beginning of a template, or more commonly, contained in a separate file that is included at the beginning of the template via the CFINCLUDE tag. Advanced developers new to UDFs may be tempted to assign frequently used functions to one of the persistent scopes (Application, Session and Server scope) in an attempt to improve performance. Although this sounds tempting, it should be avoided.
The first problem with this technique has to do with locking. Because reads and writes to persistent variables must always be locked, any time you want to reference a UDF in your code, you'll need to include two extra lines of code to lock the variable. This can get to be a real pain with frequently used functions.
The second reason to avoid this technique is that it wastes RAM. Each function you store in a persistent variable takes RAM that could probably be put to better use elsewhere in your application. The Server scope is a particularly bad place to put UDFs as Server variables persist until the ColdFusion Application Server is rebooted.
It makes much more sense to keep all of your frequently used UDFs in a single "function library" template that can be included via CFINCLUDE only in the templates where the functions are needed. If you find you use functions from your library throughout your application, you have the option of placing a single CFINCLUDE in your Application.cfm template. The minor overhead of including the function library template is easily outweighed by the inconvenience of locking every function call and the server resources wasted by storing the functions in memory-resident variables.
- Detecting WAP- and WML-enabled devices.
ColdFusion is increasingly being used to deliver content to WAP (Wireless Application Protocol) enabled devices such as cell phones and PDAs. One question I'm frequently asked is how to detect when a request to the server is made by a WAP- and WML- (Wireless Markup Language) enabled device, so that WML content can be delivered to the user instead of HTML. The answer is to use the CGI variable HTTP_ACCEPT to see if the user's browser is capable of receiving content with the MIME type text/vnd.wap.wml. The following code should be placed at the top of any page you want the check to occur on:
<!--- if the user is using a WAP enabled device,
send them to the WAP version of the
site. --->
<CFIF CGI.HTTP_ACCEPT CONTAINS "text/vnd.wap.wml">
<CFLOCATION URL="/wap/index.cfm">
</CFIF>
If the user's browser does accept the MIME type, we know the user is coming to the site with a WAP- and WML-enabled device, and he or she is rerouted to an appropriate page that generates WML content.
- Take advantage of the Verity K2 server in ColdFusion 5.0.
Since version 2.0, ColdFusion has included advanced indexing and searching capabilities using a bundled version of Verity's popular search technology via the VDK (Verity Developer's Kit). In addition to the VDK, ColdFusion 5.0 comes with a restricted version of Verity's enterprise level K2 server. K2 server offers features that appeal to clustered and large-scale sites, such as simultaneous searching of distributed collections, concurrent queries, and an overall performance gain over the VDK engine. Setting up and administering the K2 server takes a bit of work, but is well worth the results. For more information, consult the Advanced ColdFusion Administration book that comes with the official ColdFusion documentation.
10 Tips for Writing High-Performance Web Applications
Writing a Web application with ASP.NET is unbelievably easy. So easy, many developers don't take the time to structure their applications for great performance. In this article, I'm going to present 10 tips for writing high-performance Web apps. I'm not limiting my comments to ASP.NET applications because they are just one subset of Web applications. This article won't be the definitive guide for performance-tuning Web applications—an entire book could easily be devoted to that. Instead, think of this as a good place to start.
Before becoming a workaholic, I used to do a lot of rock climbing. Prior to any big climb, I'd review the route in the guidebook and read the recommendations made by people who had visited the site before. But, no matter how good the guidebook, you need actual rock climbing experience before attempting a particularly challenging climb. Similarly, you can only learn how to write high-performance Web applications when you're faced with either fixing performance problems or running a high-throughput site.
You should think about the separation of your application into logical tiers. You might have heard of the term 3-tier (or n-tier) physical architecture. These are usually prescribed architecture patterns that physically divide functionality across processes and/or hardware. As the system needs to scale, more hardware can easily be added. There is, however, a performance hit associated with process and machine hopping, thus it should be avoided. So, whenever possible, run the ASP.NET pages and their associated components together in the same application.
Because of the separation of code and the boundaries between tiers, using Web services or remoting will decrease performance by 20 percent or more.
The data tier is a bit of a different beast since it is usually better to have dedicated hardware for your database. However, the cost of process hopping to the database is still high, thus performance on the data tier is the first place to look when optimizing your code.
Before diving in to fix performance problems in your applications, make sure you profile your applications to see exactly where the problems lie. Key performance counters (such as the one that indicates the percentage of time spent performing garbage collections) are also very useful for finding out where applications are spending the majority of their time. Yet the places where time is spent are often quite unintuitive.
There are two types of performance improvements described in this article: large optimizations, such as using the ASP.NET Cache, and tiny optimizations that repeat themselves. These tiny optimizations are sometimes the most interesting. You make a small change to code that gets called thousands and thousands of times. With a big optimization, you might see overall performance take a large jump. With a small one, you might shave a few milliseconds on a given request, but when compounded across the total requests per day, it can result in an enormous improvement.
Performance on the Data Tier
When it comes to performance-tuning an application, there is a single litmus test you can use to prioritize work: does the code access the database? If so, how often? Note that the same test could be applied for code that uses Web services or remoting, too, but I'm not covering those in this article.
If you have a database request required in a particular code path and you see other areas such as string manipulations that you want to optimize first, stop and perform your litmus test. Unless you have an egregious performance problem, your time would be better utilized trying to optimize the time spent in and connected to the database, the amount of data returned, and how often you make round-trips to and from the database.
With that general information established, let's look at ten tips that can help your application perform better. I'll begin with the changes that can make the biggest difference.
Tip 1—Return Multiple Resultsets
Review your database code to see if you have request paths that go to the database more than once. Each of those round-trips decreases the number of requests per second your application can serve. By returning multiple resultsets in a single database request, you can cut the total time spent communicating with the database. You'll be making your system more scalable, too, as you'll cut down on the work the database server is doing managing requests.
While you can return multiple resultsets using dynamic SQL, I prefer to use stored procedures. It's arguable whether business logic should reside in a stored procedure, but I think that if logic in a stored procedure can constrain the data returned (reduce the size of the dataset, time spent on the network, and not having to filter the data in the logic tier), it's a good thing.
Using a SqlCommand instance and its ExecuteReader method to populate strongly typed business classes, you can move the resultset pointer forward by calling NextResult.
Tip 2—Paged Data Access
The ASP.NET DataGrid exposes a wonderful capability: data paging support. When paging is enabled in the DataGrid, a fixed number of records is shown at a time. Additionally, paging UI is also shown at the bottom of the DataGrid for navigating through the records. The paging UI allows you to navigate backwards and forwards through displayed data, displaying a fixed number of records at a time.
There's one slight wrinkle. Paging with the DataGrid requires all of the data to be bound to the grid. That is, your data layer will need to return all of the data, and then the DataGrid will filter the displayed records based on the current page. If 100,000 records are returned when you're paging through the DataGrid, 99,975 records would be discarded on each request (assuming a page size of 25). As the number of records grows, the performance of the application will suffer as more and more data must be sent on each request.
The total number of records returned can vary depending on the query being executed. For example, a WHERE clause can be used to constrain the data returned. The total number of records to be returned must be known in order to calculate the total pages to be displayed in the paging UI. For example, if there are 1,000,000 total records and a WHERE clause is used that filters this to 1,000 records, the paging logic needs to be aware of the total number of records to properly render the paging UI.
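One way to picture the alternative is to let the database return only the requested page plus the total count. The sketch below is an assumption-laden illustration, not the technique this article prescribes: it uses Python with an in-memory SQLite database, LIMIT/OFFSET stands in for whatever paging mechanism your database offers, and the table, column, and function names are invented for the example.

```python
import sqlite3


def fetch_page(conn, page_index, page_size, min_id=0):
    """Return (rows, total_count) for one page of a filtered query.

    Only page_size rows leave the database. The total is counted separately
    so the paging UI can work out how many pages exist.
    """
    where = "WHERE id > ?"
    total = conn.execute(
        f"SELECT COUNT(*) FROM orders {where}", (min_id,)
    ).fetchone()[0]
    rows = conn.execute(
        f"SELECT id, customer FROM orders {where} ORDER BY id LIMIT ? OFFSET ?",
        (min_id, page_size, page_index * page_size),
    ).fetchall()
    return rows, total


def make_demo_db(row_count=1000):
    """Build a throwaway in-memory table with ids 1..row_count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.executemany(
        "INSERT INTO orders (id, customer) VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, row_count + 1)],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    rows, total = fetch_page(conn, page_index=2, page_size=25)
    page_count = -(-total // 25)  # ceiling division
    print(len(rows), "rows on this page,", total, "rows in", page_count, "pages")
    conn.close()
```

The WHERE clause is applied to both queries, so the page count always reflects the filtered total rather than the size of the whole table.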
Tip 3—Connection Pooling
Setting up the TCP connection between your Web application and SQL Server™ can be an expensive operation. Developers at Microsoft have been able to take advantage of connection pooling for some time now, allowing them to reuse connections to the database. Rather than setting up a new TCP connection on each request, a new connection is set up only when one is not available in the connection pool. When the connection is closed, it is returned to the pool where it remains connected to the database, as opposed to completely tearing down that TCP connection.
Of course you need to watch out for leaking connections. Always close your connections when you're finished with them. I repeat: no matter what anyone says about garbage collection within the Microsoft® .NET Framework, always call Close or Dispose explicitly on your connection when you are finished with it. Do not trust the common language runtime (CLR) to clean up and close your connection for you at a predetermined time. The CLR will eventually destroy the class and force the connection closed, but you have no guarantee when the garbage collection on the object will actually happen.
To use connection pooling optimally, there are a couple of rules to live by. First, open the connection, do the work, and then close the connection. It's okay to open and close the connection multiple times on each request if you have to (optimally you apply Tip 1) rather than keeping the connection open and passing it around through different methods. Second, use the same connection string (and the same thread identity if you're using integrated authentication). If you don't use the same connection string, for example customizing the connection string based on the logged-in user, you won't get the same optimization value provided by connection pooling. And if you use integrated authentication while impersonating a large set of users, your pooling will also be much less effective. The .NET CLR data performance counters can be very useful when attempting to track down any performance issues that are related to connection pooling.
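Both rules are easier to see in a toy model. The following Python sketch is not how ADO.NET implements pooling; the class, the connect parameter, and the opened counter are assumptions made for illustration, and the pool is not thread-safe. It shows why connections are pooled per connection string and how open-late, close-early looks in code.

```python
import sqlite3
from collections import defaultdict


class TinyPool:
    """Toy pool: idle connections are kept per connection string and reused."""

    def __init__(self, connect=sqlite3.connect):
        self._connect = connect          # factory that opens a real connection
        self._idle = defaultdict(list)   # connection string -> idle connections
        self.opened = 0                  # how many real connections were created

    def acquire(self, conn_string):
        if self._idle[conn_string]:
            return self._idle[conn_string].pop()
        self.opened += 1
        return self._connect(conn_string)

    def release(self, conn_string, conn):
        # "Closing" hands the connection back instead of tearing it down.
        self._idle[conn_string].append(conn)


def run_query(pool, conn_string, sql):
    """Open late, do the work, close early, even if the query raises."""
    conn = pool.acquire(conn_string)
    try:
        return conn.execute(sql).fetchall()
    finally:
        pool.release(conn_string, conn)


if __name__ == "__main__":
    shared = TinyPool()
    for _ in range(5):
        run_query(shared, ":memory:", "SELECT 1")
    print("same connection string, connections opened:", shared.opened)

    # A distinct string per user defeats the pool: every user pays for a new one.
    per_user = TinyPool(connect=lambda _s: sqlite3.connect(":memory:"))
    for user in ("alice", "bob", "carol"):
        run_query(per_user, f"user={user}", "SELECT 1")
    print("per-user connection strings, connections opened:", per_user.opened)
```

The try/finally in run_query is the part worth copying: the connection goes back to the pool on every path, rather than whenever garbage collection gets around to it.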
Whenever your application is connecting to a resource, such as a database, running in another process, you should optimize by focusing on the time spent connecting to the resource, the time spent sending or retrieving data, and the number of round-trips. Optimizing any kind of process hop in your application is the first place to start to achieve better performance.
The application tier contains the logic that connects to your data layer and transforms data into meaningful class instances and business processes. For example, in Community Server, this is where you populate a Forums or Threads collection, and apply business rules such as permissions; most importantly it is where the Caching logic is performed.
Tip 4—ASP.NET Cache API
One of the very first things you should do before writing a line of application code is architect the application tier to maximize and exploit the ASP.NET Cache feature.
If your components are running within an ASP.NET application, you simply need to include a reference to System.Web.dll in your application project. When you need access to the Cache, use the HttpRuntime.Cache property (the same object is also accessible through Page.Cache and HttpContext.Cache).
There are several rules for caching data. First, if data can be used more than once it's a good candidate for caching. Second, if data is general rather than specific to a given request or user, it's a great candidate for the cache. If the data is user- or request-specific, but is long lived, it can still be cached, but may not be used as frequently. Third, an often-overlooked rule is that sometimes you can cache too much. Generally on an x86 machine, you want to run a process with no higher than 800MB of private bytes in order to reduce the chance of an out-of-memory error. Therefore, caching should be bounded. In other words, you may be able to reuse a result of a computation, but if that computation takes 10 parameters, you might attempt to cache on 10 permutations, which will likely get you into trouble. One of the most common support calls for ASP.NET is out-of-memory errors caused by overcaching, especially of large datasets.
There are several great features of the Cache that you need to know. The first is that the Cache implements a least-recently-used algorithm, allowing ASP.NET to force a Cache purge (automatically removing unused items from the Cache) if memory is running low. Secondly, the Cache supports expiration dependencies that can force invalidation. These include time, key, and file. Time is often used, but with ASP.NET 2.0 a new and more powerful invalidation type is being introduced: database cache invalidation. This refers to the automatic removal of entries in the cache when data in the database changes.
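The first two features, least-recently-used eviction and time-based expiration, can be modeled in a few lines. This Python sketch is only an approximation for illustration: the class and method names are hypothetical, and it evicts on a fixed item count, whereas the ASP.NET Cache purges in response to memory pressure.

```python
import time
from collections import OrderedDict


class BoundedCache:
    """Toy cache with a size bound, LRU eviction, and per-item absolute expiration."""

    def __init__(self, max_items, clock=time.monotonic):
        self._items = OrderedDict()  # key -> (value, expires_at), oldest use first
        self._max_items = max_items
        self._clock = clock

    def insert(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)
        self._items.move_to_end(key)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used item

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: invalidate on access
            return None
        self._items.move_to_end(key)  # mark as recently used
        return value


if __name__ == "__main__":
    cache = BoundedCache(max_items=2)
    cache.insert("forums", ["General", "Support"], ttl_seconds=60)
    cache.insert("threads", ["Welcome"], ttl_seconds=60)
    cache.get("forums")                            # "threads" is now least recently used
    cache.insert("users", ["rob"], ttl_seconds=60)  # evicts "threads"
    print(cache.get("threads"), cache.get("forums"))
```

The size bound is the part that corresponds to the "you can cache too much" rule: the cache never grows past what you decided it was allowed to hold.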
Tip 5—Per-Request Caching
In the Forums application of Community Server, each server control used on a page requires personalization data to determine which skin to use, the style sheet to use, as well as other personalization data. Some of this data can be cached for a long period of time, but some data, such as the skin to use for the controls, is fetched once on each request and reused multiple times during the execution of the request.
To accomplish per-request caching, use the ASP.NET HttpContext. An instance of HttpContext is created with every request and is accessible anywhere during that request from the HttpContext.Current property. The HttpContext class has a special Items collection property; objects and data added to this Items collection are cached only for the duration of the request. Just as you can use the Cache to store frequently accessed data, you can use HttpContext.Items to store data that you'll use only on a per-request basis. The logic behind this is simple: data is added to the HttpContext.Items collection when it doesn't exist, and on subsequent lookups the data found in HttpContext.Items is simply returned.
Tip 6—Background Processing
The path through your code should be as fast as possible, right? There may be times when you find yourself performing expensive tasks on each request or once every n requests. Sending out e-mails or parsing and validation of incoming data are just a few examples.
When tearing apart ASP.NET Forums 1.0 and rebuilding what became Community Server, we found that the code path for adding a new post was pretty slow. Each time a post was added, the application first needed to ensure that there were no duplicate posts, then it had to parse the post using a "badword" filter, parse the post for emoticons, tokenize and index the post, add the post to the moderation queue when required, validate attachments, and finally, once posted, send e-mail notifications out to any subscribers. Clearly, that's a lot of work.
It turns out that most of the time was spent in the indexing logic and sending e-mails. Indexing a post was a time-consuming operation, and it turned out that the built-in System.Web.Mail functionality would connect to an SMTP server and send the e-mails serially. As the number of subscribers to a particular post or topic area increased, it would take longer and longer to perform the AddPost function.
Indexing and sending e-mail didn't need to happen on each request. Ideally, we wanted to batch this work together and index 25 posts at a time or send all the e-mails every five minutes.
The Timer class, found in the System.Threading namespace, is a wonderfully useful, but less well-known class in the .NET Framework, at least for Web developers. Once created, the Timer will invoke the specified callback on a thread from the ThreadPool at a configurable interval. This means you can set up code to execute without an incoming request to your ASP.NET application, an ideal situation for background processing. You can do work such as indexing or sending e-mail in this background process too.
There are a couple of problems with this technique, though. If your application domain unloads, the timer instance will stop firing its events. In addition, since the CLR has a hard gate on the number of threads per process, you can get into a situation on a heavily loaded server where timers may not have threads to complete on and can be somewhat delayed. ASP.NET tries to minimize the chances of this happening by reserving a certain number of free threads in the process and only using a portion of the total threads for request processing. However, if you have lots of asynchronous work, this can be an issue.
Tip 7—Page Output Caching and Proxy Servers
ASP.NET is your presentation layer (or should be); it consists of pages, user controls, server controls (HttpHandlers and HttpModules), and the content that they generate. If you have an ASP.NET page that generates output, whether HTML, XML, images, or any other data, and you run this code on each request and it generates the same output, you have a great candidate for page output caching.
By simply adding this line to the top of your page
<%@ OutputCache VaryByParam="none" Duration="60" %>
you can effectively generate the output for this page once and reuse it multiple times for up to 60 seconds, at which point the page will re-execute and the output will once again be added to the ASP.NET Cache. This behavior can also be accomplished using some lower-level programmatic APIs. There are several configurable settings for output caching, such as the VaryByParam attribute just described. VaryByParam happens to be required, and it allows you to specify the HTTP GET or HTTP POST parameters by which to vary the cache entries. For example, default.aspx?Report=1 or default.aspx?Report=2 could be output-cached by simply setting VaryByParam="Report". Additional parameters can be named by specifying a semicolon-separated list.
Many people don't realize that when the Output Cache is used, the ASP.NET page also generates a set of HTTP headers that downstream caching servers, such as those used by the Microsoft Internet Security and Acceleration Server or by Akamai, understand. When HTTP Cache headers are set, the documents can be cached on these network resources, and client requests can be satisfied without having to go back to the origin server.
Using page output caching, then, does not make your application more efficient, but it can potentially reduce the load on your server as downstream caching technology caches documents. Of course, this can only be anonymous content; once it's downstream, you won't see the requests anymore and can't perform authentication to prevent access to it.
Tip 8—Run IIS 6.0 (If Only for Kernel Caching)
If you're not running IIS 6.0 (Windows Server™ 2003), you're missing out on some great performance enhancements in the Microsoft Web server. In Tip 7, I talked about output caching. In IIS 5.0, a request comes through IIS and then to ASP.NET. When caching is involved, an HttpModule in ASP.NET receives the request, and returns the contents from the Cache.
If you're using IIS 6.0, there is a nice little feature called kernel caching that doesn't require any code changes to ASP.NET. When a request is output-cached by ASP.NET, the IIS kernel cache receives a copy of the cached data. When a request comes from the network driver, a kernel-level driver (no context switch to user mode) receives the request, and if cached, flushes the cached data to the response, and completes execution. This means that when you use kernel-mode caching with IIS and ASP.NET output caching, you'll see unbelievable performance results. At one point during the Visual Studio 2005 development of ASP.NET, I was the program manager responsible for ASP.NET performance. The developers did the magic, but I saw all the reports on a daily basis. The kernel mode caching results were always the most interesting. The common characteristic was network saturation by requests/responses and IIS running at about five percent CPU utilization. It was amazing! There are certainly other reasons for using IIS 6.0, but kernel mode caching is an obvious one.
Tip 9—Use Gzip Compression
While not necessarily a server performance tip (since you might see CPU utilization go up), using gzip compression can decrease the number of bytes sent by your server. This gives the perception of faster pages and also cuts down on bandwidth usage. Depending on the data sent, how well it can be compressed, and whether the client browsers support it (IIS will only send gzip compressed content to clients that support gzip compression, such as Internet Explorer 6.0 and Firefox), your server can serve more requests per second. In fact, just about any time you can decrease the amount of data returned, you will increase requests per second.
The good news is that gzip compression is built into IIS 6.0 and is much better than the gzip compression used in IIS 5.0. Unfortunately, when attempting to turn on gzip compression in IIS 6.0, you may not be able to locate the setting on the properties dialog in IIS. The IIS team built awesome gzip capabilities into the server, but neglected to include an administrative UI for enabling it. To enable gzip compression, you have to spelunk into the innards of the XML configuration settings of IIS 6.0.
Tip 10—Server Control View State
View state is a fancy name for ASP.NET storing some state data in a hidden input field inside the generated page. When the page is posted back to the server, the server can parse, validate, and apply this view state data back to the page's tree of controls. View state is a very powerful capability since it allows state to be persisted with the client and it requires no cookies or server memory to save this state. Many ASP.NET server controls use view state to persist settings made during interactions with elements on the page, for example, saving the current page that is being displayed when paging through data.
There are a number of drawbacks to the use of view state, however. First of all, it increases the total payload of the page both when served and when requested. There is also an additional overhead incurred when serializing or deserializing view state data that is posted back to the server. Lastly, view state increases the memory allocations on the server.
Several server controls, the most well known of which is the DataGrid, tend to make excessive use of view state, even in cases where it is not needed. View state is enabled by default, but if you don't need it, you can turn it off at the control or page level. Within a control, you simply set the EnableViewState property to false, or you can set it globally within the page using this setting:
<%@ Page EnableViewState="false" %>
If you are not doing postbacks in a page or are always regenerating the controls on a page on each request, you should disable view state at the page level.
C Programming Tips
This document is a collection of useful lessons about C programming. It is not meant to serve as a tutorial for the language (many good free tutorials exist online) or to nit-pick on syntax. Most tips are widely applicable for C development on any platform and may even apply to programming in other languages.
For quick hacks and small projects, high-level interpreted languages like Python and Ruby allow you to get something up and running much faster with less fuss. For more structured software engineering-like work, languages like Java and C# offer better type checking, more natural support for data abstractions, and more powerful standard libraries. For web programming, there's PHP, XSLT, and loads of other acronyms.
Many people (myself included) program in C because they need to interface with libraries and other programs that are written in C. For all of its lack of safety, lack of support for high-level abstractions, and the horrendously gross things you can do with it, C is still one of the most widely used programming languages in the world. It is used to build all types of software ranging from low-level device drivers to graphically intensive desktop applications. Some of the most sophisticated pieces of software (e.g., operating systems, web servers, compilers, networking tools) are written in C. Much of the enormous body of free and open-source software (e.g., the GNU project) is written in C.
If you want to build on top of these existing code bases, or if you need low-level close-to-the-hardware functionality, then it is usually easier to program in C than to use another language. However, before starting to program in C, ask yourself whether it is the best language for the job that you are trying to accomplish. You should always choose the language that operates on the highest-level of abstraction that is adequate for your task. Since C is a fairly low-level language, you should probably not use it unless the alternatives are all sub-optimal.
The following tips address the problem that bugs in C programs can often go unnoticed for a long time during execution from when the bug occurs to when the actual error surfaces. Often times, the manifested errors may have little to do with the original bug that caused that error, which can be extremely frustrating to debug. You need to be vigilant about ensuring that your bugs are caught as soon as possible during execution. The C compiler and runtime system will not help you out much; the most useful error message it can give you is the infamous Segmentation fault. Of course, you can always run your program through a debugger and get a backtrace to observe its state at the time of the crash, but the problem with this is that it may be way too late to tell what caused the crash. You want to have your program crash as early as possible if there is a problem in order to not let bugs propagate.
Use assert statements to document and enforce function pre-conditions and other assumptions
The basic unit of abstraction in most C programs is the function. Many non-trivial functions take pointers to data structures as arguments and often mutate them in sophisticated ways. In order to perform the intended task correctly, functions expect their arguments to have certain properties (e.g., this pointer is non-null, this integer is within a certain range). These are called the pre-conditions of the function. In addition, during certain points within a function (such as within a branch of an if statement), certain assumptions must hold true. You should always write down these assumptions as comments within your code.
However, a more powerful way to document these assumptions in addition to writing comments is to include them directly in the code as assert statements. An assert statement takes an expression, evaluates it, and if it is false, aborts the program with an error message stating where in the source code the assertion failed. This simple construct, when used aggressively, can help track down many bugs and also serve as valuable documentation. Use an assert statement whenever you make any non-trivial assumption about some part of your program. The immense power of an assert is that it allows the programmer to catch bugs early before they propagate to other parts of the program and cause weird crashes or memory corruption (which is very easy in C programs if you are not careful). It also serves as documentation that is often better than comments because it actually compiles and executes like the rest of your code. Don't ever feel hesitant to include assert statements because it might 'slow down' your code slightly (you will probably not notice the slowdowns); the benefits of peace-of-mind and improved bug-finding capabilities are far more valuable.
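As a minimal sketch of the idea above, the function below states its pre-conditions as assert statements rather than only as comments. The function average_reading and the limit MAX_READINGS are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on how many readings a caller may pass. */
enum { MAX_READINGS = 1024 };

/* Returns the integer average of the first `count` readings.
 * Pre-conditions: `readings` is non-null and 1 <= count <= MAX_READINGS. */
int average_reading(const int *readings, size_t count) {
    assert(readings != NULL);                    /* caller must pass a real array */
    assert(count > 0 && count <= MAX_READINGS);  /* also rules out divide-by-zero */

    long sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += readings[i];
    }
    return (int)(sum / (long)count);
}
```

If a caller ever passes a null pointer or a zero count, the program aborts at the assert and reports the file and line, instead of crashing later somewhere unrelated.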
Non-trivial C programs often operate on data structures that contain integers, strings, and pointers to other data structures. There are certain properties called rep. (representation) invariants that must be true about a particular data structure in order for it to be 'well-formed' (i.e., to conform to its specs). Functions that operate on these data structures often assume that they are 'well-formed' or else the code that operates on them may be useless. Like function pre-conditions, programmers should write down representation invariants in comments. This may be trivial for simple data structures, but can be a difficult task for sophisticated ones which include pointers to other data structures.
A better idea is to actually encode these rep. invariants as a series of assert statements placed within a rep. check (representation check) function. Whenever it is convenient, insert calls to the rep. check functions for the appropriate data structures. Once a particular instance passes a rep. check call, you know, for the moment at least, that your assumptions about it hold true. You can think of a rep. check as an application of assert to protect assumptions about data structures rather than about functions. It is crucial to report errors in data structures as early as possible, because a corrupted data structure can continue working fine for a long time until it crashes in some bizarre way in some distant part of the program far from where it was initialized. The combination of applying assert statements on functions and data structures has helped me to catch countless bugs in my code that would have been much more difficult to debug otherwise.
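A rep. check might look something like the sketch below. The IntStack type, its fields, and the function names are all hypothetical; the point is only that every invariant written in the comments also appears as an assert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity stack of ints. */
typedef struct {
    int *items;       /* backing storage, never NULL */
    size_t capacity;  /* number of slots in items, always > 0 */
    size_t size;      /* number of slots in use, always <= capacity */
} IntStack;

/* Rep. check: asserts every representation invariant of an IntStack. */
void int_stack_rep_check(const IntStack *s) {
    assert(s != NULL);
    assert(s->items != NULL);
    assert(s->capacity > 0);
    assert(s->size <= s->capacity);
}

/* Pushes a value; checks the invariants on entry and again on exit. */
void int_stack_push(IntStack *s, int value) {
    int_stack_rep_check(s);
    assert(s->size < s->capacity);  /* pre-condition: the stack is not full */

    s->items[s->size] = value;
    s->size++;

    int_stack_rep_check(s);
}
```

Calling the rep. check at both the start and the end of each mutating function narrows down which function corrupted the structure when an assertion does fail.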
If you declare a local variable, it is not initialized to anything, and if you allocate a new block of heap memory using malloc(), that is also uninitialized. Uninitialized (garbage) data is never useful, so make sure that none of your data is ever uninitialized. Be vigilant about always writing int foo = 0 instead of writing int foo and worrying about it later (because you may forget to initialize it on some code path). Instead of malloc(), use calloc() to allocate a block of memory and initialize it to 0. Sure, it takes several more instructions to initialize data to 0, but we're not in the 1970s anymore! A minuscule gain in performance is never an excuse to increase the possibility of introducing nasty bugs into your program.
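The habit described above can be sketched as follows. The Account type and both function names are hypothetical. One assumption worth stating: calloc() sets all bytes to zero, which reads back as a null pointer and as 0.0 on mainstream platforms, although the C standard does not strictly promise that for pointers and floating-point values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record type used only for this illustration. */
typedef struct {
    int id;
    double balance;
    char *name;
} Account;

/* Allocates one Account with every byte set to zero. */
Account *account_new(void) {
    Account *a = calloc(1, sizeof(*a));
    assert(a != NULL);  /* fail loudly if the allocation did not succeed */
    return a;
}

/* Counts the positive entries; the local counter starts at a known value. */
int count_positive(const int *values, size_t n) {
    assert(values != NULL);
    int count = 0;  /* initialized at declaration, never "later" */
    for (size_t i = 0; i < n; i++) {
        if (values[i] > 0) {
            count++;
        }
    }
    return count;
}
```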
Sometimes it is tempting to have a function return early without doing anything if a certain pre-condition isn't satisfied (e.g., with something like if (foo->bar < 0) return;). After all, the entire program shouldn't abort just because the function sees an input that is inappropriate, right? I disagree. I think that you should rarely have a function return without doing anything. Turn these conditions into asserts so that the function fails with a bang when it sees invalid input. Why are you even passing invalid inputs into the function in the first place? If many of your functions fail silently, then bugs can go unnoticed and surface at really bad times. In general, you need to keep the tightest net you can around your code to make sure that all bugs manifest themselves as early as possible.
Pointers are one of those things that are simple to define (a value that holds a memory address) but can cause beginner C programmers endless headaches. Pointers are unavoidable in non-trivial C programs because C only supports passing function parameters by value, so there is no way for a function to mutate its arguments except if you pass in pointers. In general, use pointers and dynamic memory allocation as little as you need to (favor local variables because the stack is automatically 'garbage collected' when a function exits), but keep these tips in mind when you do need to use them:
It is a common misconception that pointers are always used with dynamically-allocated memory. A pointer is simply a value that holds a memory address. It could hold the address of a global variable, a local variable on the stack, or a dynamically-allocated value on the heap. Because values in the heap do not have names, the only way to refer to them is using pointers. However, many pointers point to global or local variables, whose addresses can be acquired using the & operator. So whenever you see a pointer, don't automatically assume that it refers to a dynamically-allocated value.
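To make the point concrete, here is a small sketch in which the same helper is handed the address of a global variable, a local variable, and a heap block. All names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

static int global_counter = 10;  /* lives in global storage */

/* Increments whatever int the pointer refers to, wherever it lives. */
void add_one(int *p) {
    assert(p != NULL);
    (*p)++;
}

/* Passes pointers to three kinds of storage to the same helper. */
int demo_pointer_targets(void) {
    int local_counter = 20;                                /* on the stack */
    int *heap_counter = calloc(1, sizeof(*heap_counter));  /* on the heap */
    assert(heap_counter != NULL);
    *heap_counter = 30;

    add_one(&global_counter);  /* address taken with & */
    add_one(&local_counter);   /* address taken with & */
    add_one(heap_counter);     /* already a pointer */

    int total = global_counter + local_counter + *heap_counter;

    free(heap_counter);        /* only the heap block is ever freed */
    heap_counter = NULL;
    return total;
}
```

The helper add_one cannot tell the three cases apart, which is exactly why the caller has to know which pointers own heap memory.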
The scourge of not having a garbage collector is that you have the burden of manually freeing all dynamically-allocated memory. In concept, manual memory management is really simple: Every call to malloc() of a memory block should be followed later by a matching call to free(). Of course, in practice this can be extremely hard to guarantee. The main reason for this difficulty is that the call to malloc() may reside in some completely unrelated part of your code or even in libraries. You often allocate memory somewhere and pass around a pointer to it to many different functions before you need to free it.
In your own code, make sure that you understand where every pointer comes from and whether it needs to be freed later (remember that pointers to global and stack areas never need to be freed). If you call a library function, you should look at its API and find out whether it dynamically allocates any memory that you may have to free later (e.g., strdup() for duplicating strings).
Whenever you call free() to free the memory referred to by a pointer, always set the pointer to 0 (NULL) right away. The data referred to by freed pointers does not get erased, so it is possible later in your program to use that freed pointer to read that data back, although that is very dangerous. After a block of memory has been freed, the C library can re-assign it to another pointer at any time, so the freed pointer should never be used to refer to that block. The best way to keep this guarantee is to set a pointer to 0 after calling free() on it.
Before you call free() on a pointer, ask yourself whether there are any other pointers that refer to the same memory location. If so, either don't call free() or set all of those other pointers to 0 as well as the freed pointer. One of the worst kinds of C bugs (memory corruption) occurs when you have two or more pointers referring to one location (called aliasing) and you call free() on one of the pointers to free up that location. Even if you set that pointer to 0 immediately, there are still other pointers that refer to that location. Because the memory does not get re-assigned right away, the program can still use those other pointers to read back valid data. However, there is no guarantee of when that memory will get clobbered with new data, and when it does, your program will either crash (hope for that) or worse, nefariously do something incorrect and propagate a bug to some other part before surfacing it.
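One way to make the free-then-clear habit hard to forget is a small macro, sketched below. FREE_AND_NULL and release_demo are hypothetical names, and this is one of the cases where a macro is hard to replace with a function, because it has to assign to the caller's own pointer variable.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees the block and clears the caller's pointer in one step. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Returns 1 if both the owner and its alias end up cleared. */
int release_demo(void) {
    int *owner = calloc(4, sizeof(*owner));
    assert(owner != NULL);

    int *alias = owner;    /* a second pointer to the same block */

    FREE_AND_NULL(owner);  /* the block is gone and owner is NULL */
    alias = NULL;          /* every alias must be cleared by hand */

    return (owner == NULL) && (alias == NULL);
}
```

The macro only protects the pointer it is given. Aliases still have to be tracked and cleared manually, which is why the question about other pointers has to be asked before every call to free().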
These tips are stylistic because they have no impact on your code's behavior. However, they can greatly help improve the organization and readability of your code, thus making it easier for you to find bugs.
When programming in any language, you should make sure that program modules are as isolated and self-contained as possible. Unfortunately, C doesn't provide any way to enforce modularity other than files. You need to use files to form strict module boundaries. Declare all functions as static (only visible within the file) unless they absolutely need to be called from functions in other files. Keep non-static functions to a minimum; the narrower the interface between different modules (files), the less you will hopefully have to debug. The same goes for global variables. Declare them static unless they need to be accessed by functions in other files. Global variables make it extremely difficult to reason about program behavior because they destroy locality (just ask proponents of functional programming). Local variables should be declared at the top of the smallest enclosing block where they will be used. Do not get lazy and declare all local variables at the top of a function, or even worse, re-use the same local variable for many different tasks.
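A file laid out along these lines might look like the sketch below. The module and all of its names are hypothetical. Only counter_add is visible to other files; the state and the helper are static.

```c
#include <assert.h>

/* File-private state: other files cannot name or modify it. */
static int total = 0;

/* File-private helper: not part of the module's interface. */
static int clamp(int value, int low, int high) {
    assert(low <= high);
    if (value < low) {
        return low;
    }
    if (value > high) {
        return high;
    }
    return value;
}

/* The module's only public function: adds a bounded step to the total. */
int counter_add(int amount) {
    enum { MAX_STEP = 10 };
    if (amount != 0) {
        /* Declared in the smallest block that needs it. */
        int step = clamp(amount, 0, MAX_STEP);
        total += step;
    }
    return total;
}
```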
I won't go into a diatribe about the C preprocessor (it can be beneficial if used sparingly when there are no other easy alternatives), but don't use preprocessor #define statements when you can use an enumeration instead. For example,
#define V_TRUCK 1
#define V_CAR 2
#define V_BIKE 3
is bad because the compiler has no clue that these three constants are related to one another. After pre-processing, all the compiler sees are numeric literals (oh yeah, don't use those either!). A better approach is to use an enum:
typedef enum {V_TRUCK = 1, V_CAR, V_BIKE} Vehicle;
because now the compiler knows that V_TRUCK, V_CAR, and V_BIKE all belong to the type Vehicle. C has fairly weak type checking in general, but it can at least type check enums and give you warnings, which is better than no checks if you use #define macros.
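As a sketch of what the enum buys you, the switch below covers every Vehicle value and has no default case. The function wheel_count and its return values are made up for illustration. With GCC, -Wall includes -Wswitch, which warns when a switch over an enum type leaves out one of its enumerators, so adding a fourth vehicle later would flag this function.

```c
#include <assert.h>

typedef enum { V_TRUCK = 1, V_CAR, V_BIKE } Vehicle;

/* Returns a made-up wheel count for each kind of vehicle. */
int wheel_count(Vehicle v) {
    switch (v) {
    case V_TRUCK:
        return 6;
    case V_CAR:
        return 4;
    case V_BIKE:
        return 2;
    }
    /* Reached only if v holds a value outside the enumeration. */
    assert(0 && "unknown Vehicle");
    return 0;
}
```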
You should never use a numeric literal in your code unless it truly stands for a number and cannot be more easily expressed in some other expression. The most common case of this is the use of numbers to indicate variable sizes when calling memory allocator functions. For example, if you want to allocate an array of 4 integers (initialized to 0), the following 3 statements are equivalent (on a 32-bit machine like the x86):
1. int* foo = (int*)calloc(4, 4); // Huh? What does 4 mean?
2. int* foo = (int*)calloc(4, sizeof(int)); // good ...
3. int* foo = (int*)calloc(4, sizeof(*foo)); // but even better
Version 1 is bad because it uses the number 4 to stand for the size of an int when an expression like sizeof(int) in version 2 is clearer, less error-prone (what if you forget that an int is 4 bytes on an x86), and more portable (what if you switch to a different architecture?). However, I prefer version 3, which expresses the size in terms of the actual pointer variable foo. sizeof(*foo) returns the size of whatever foo refers to, which in this case, is an int. This is the most robust solution because if you later change foo to a different type, then you don't have to change the sizeof expression at all. Remember that sizeof is resolved at compile-time, so using it does not incur any run-time overhead over simply using a number. Also, don't be afraid to add the results of a few sizeof expressions together in your code rather than trying to do the math yourself. Constant folding performed by the compiler will resolve those operations and produce one number in the object code.
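Both points can be sketched briefly. The names below are hypothetical, as is the wire format. In make_zeroed_array, changing the element type means editing only the declaration of values; in the enum, the compiler adds the three sizes for you.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wire format: a 2-byte type, a 4-byte length, a 1-byte flag.
 * The sum is folded into a single constant at compile time. */
enum {
    PACKET_HEADER_BYTES = sizeof(uint16_t) + sizeof(uint32_t) + sizeof(uint8_t)
};

/* Allocates `count` zeroed elements; the size follows the pointer's type. */
long *make_zeroed_array(size_t count) {
    assert(count > 0);
    long *values = calloc(count, sizeof(*values));
    assert(values != NULL);
    return values;
}
```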
Your (non-trivial) program will always have bugs, so it is never too early to learn to debug. Here are some useful strategies:
Many programmers scoff at print statements as something that only newbies use. However, I think that they are very valuable in giving you an overall feel for program execution. Remember to always end a debugging print statement with a newline character ('\n') because many implementations of printf() buffer their output but are forced to write the buffer to the terminal when they encounter a newline. If you don't end your print statements with newlines, then sometimes you may think that a particular part of your code hasn't been reached before your program crashed (since nothing was printed out there), when in fact it was reached but your statement never printed in time due to buffering.
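A variation on this advice, not taken from the text above, is to send debugging output to stderr and flush it explicitly, so that the message is out before any crash that follows. The helper name debug_log and the message format are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Writes one debugging line to stderr and flushes it immediately.
 * Returns the number of characters written, or a negative value on error. */
int debug_log(const char *where, int value) {
    assert(where != NULL);
    int written = fprintf(stderr, "[debug] %s: value=%d\n", where, value);
    fflush(stderr);  /* belt and braces: stderr is not fully buffered by default */
    return written;
}
```

A call such as debug_log("parse_header", length) placed before and after a suspect statement shows which of the two was reached.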
If your program crashes, you can run it through a debugger and re-construct its execution state at the time of the crash, but often you want to know what events led up to the crash. You could step through your program from the beginning of execution, but that is extremely tedious and slow. What I do is sprinkle print statements throughout my program, observe the output to see what gets printed before the crash, and repeat until I gain a good sense of where my problem may be. Then I can fire up a debugger and set a breakpoint in the vicinity of that area.
Let's say that when you're trying to debug, you only want to stop at a particular function when some complex condition has been met. Ordinarily, you would set a conditional breakpoint in the debugger. You probably need to set the conditions again every time you restart the debugger, which can be annoying. Instead, a better idea is to set the conditional in the program itself like so:
100 void foo(int a, int b, int c) {
101   ... blah ...
120   if (((a % 3) == 0) ||
121       ((b < (c * 2)) && ((c + a) < 0))) {
122     printf("BREAK!!!");
123   }
150   ... bleh ...
151 }
Now all you have to do is to set a regular (unconditional) breakpoint on line 122 (the line with the print statement). This line should only be executed if your condition passes. You now have a conditional breakpoint without messing with the debugger at all. This can often be a much better alternative than setting complex conditions within the debugger because you can include arbitrary function calls in those conditions, which may be difficult or impossible to do within the debugger.
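A small variation, sketched here under the assumption that you are free to add a helper function, is to move the print statement into a function of its own. You can then break on the function name (in GDB, break debug_break) rather than on a line number that shifts whenever the file is edited. The names debug_break and break_hits are hypothetical; foo and its condition are the ones from the example above.

```c
#include <assert.h>
#include <stdio.h>

static int break_hits = 0;  /* how many times the condition has fired */

/* Set an unconditional breakpoint on this function in the debugger. */
void debug_break(void) {
    break_hits++;
    fprintf(stderr, "BREAK!!!\n");
}

void foo(int a, int b, int c) {
    /* The "conditional breakpoint" lives in the program itself. */
    if (((a % 3) == 0) ||
        ((b < (c * 2)) && ((c + a) < 0))) {
        debug_break();
    }
    /* ... the rest of foo would go here ... */
}
```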
Use watchpoints in a debugger like GDB to diagnose memory corruption errors, one of the nastiest and hardest-to-find types of bugs in a memory-unsafe language like C. Memory corruption can occur when some data structure in your program retains a pointer to a region of memory that is freed without its knowledge (via an aliased pointer), and then some other part of your program re-allocates that memory to be used for another purpose. You should suspect memory corruption whenever you find that there is some bug where a value is valid at a particular time but jumbled at another time, and that the time when the corruption occurs varies across different runs.
To fix these bugs, you first need to find the line of code that causes the value to be clobbered with junk. To do so, go into GDB and step to a line of the code where the program accesses the data that will later be corrupted. Set a watchpoint on the expression that you used to access the data using watch foo, where foo is the expression that you want GDB to watch. Now continue to execute your program normally, and GDB will pause it when the contents of foo first change and give you the line of code that caused the change. Often, this is enough information for you to realize what you need to do to fix your bug. Note that all a watchpoint does is stop the program when the value of the expression it is watching changes, so to prevent false alarms, you want to set the watchpoint at a time when you no longer expect that data to change ... but of course, it will change due to the memory corruption, which is precisely the change that you want the watchpoint to catch.
Don't wait until you've finished your entire project to start testing it. Start as early as you can, and build up your test suite as you develop. Testing is tedious, but it is also one of the best ways to uncover bugs. Everybody has their own views on testing, so my only advice here is that you shouldn't ignore it. If you can automate your tests, then you will be encouraged to run them more often during development, which will help you to catch more bugs. As your program grows more complicated, it becomes harder and harder to determine whether a change in one part will have weird side-effects on some other part of the program. If you have a good test suite, running it after your change and seeing that there were no diffs can give you some confidence (although no guarantees) that your change did not have un-intended side-effects.
Testing is like flossing your teeth. Everyone knows that it's a good thing, but most people don't do it often enough. People remember the tedious hours they had to spend on their school programming assignments writing boring and trivial unit tests for functions that, well, perform boring and trivial tasks.
All of this overhead of setting up data structures can be eliminated when writing system tests. For a particular function, its calling function usually sets up the data structure state properly (or the caller of the caller does, etc...). All a system test needs as input is the input to the system itself. All a system test needs to verify as output is the output of the system. All of the overhead of setting up intermediate levels of infrastructure can be eliminated. The best aspect of a system test is that when it passes, you know that your entire system (or sub-system) works, not just a small portion of it. Being able to run and test a fully-functional (albeit buggy) system is a whole lot more encouraging and useful than unit testing small parts of it.
For example, for the program analysis tools that I am building, the input is a C program and the output is a text file trace of the program's data structures during execution. I can manually verify that a trace is correct, and then use it as my 'golden file' to compare against the traces produced by subsequent runs of my tool (i.e. using diff). If there are diffs, then I can immediately see what differs, which can help direct me to a particular part of the program that is troubling. I can either try to debug it directly or write some more specialized tests (maybe unit tests) that specifically exercise that part of the code.
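The golden-file comparison is normally done with diff, as described above. If you want the pass/fail check inside a C test driver instead, a byte-by-byte comparison along the following lines would do. This is only a sketch; the function name files_match and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Compares two files byte by byte.
 * Returns 1 if they match, 0 if they differ, -1 if either cannot be opened. */
int files_match(const char *golden_path, const char *actual_path) {
    assert(golden_path != NULL);
    assert(actual_path != NULL);

    FILE *golden = fopen(golden_path, "rb");
    FILE *actual = fopen(actual_path, "rb");
    int result = 1;

    if (golden == NULL || actual == NULL) {
        result = -1;
    } else {
        int g = 0;
        int a = 0;
        do {
            g = fgetc(golden);
            a = fgetc(actual);
            if (g != a) {
                result = 0;  /* differing byte, or one file ended early */
            }
        } while (result == 1 && g != EOF);
    }

    if (golden != NULL) {
        fclose(golden);
    }
    if (actual != NULL) {
        fclose(actual);
    }
    return result;
}
```

A result of 0 tells you only that the trace changed. Running diff on the two files is still the quickest way to see what changed.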
System tests are especially powerful when combined with heavy use of assert statements throughout your program to enforce conditions such as data structure representation invariants and function pre-conditions. As you write more system tests, you will achieve better code coverage, and if something goes wrong, chances are that some assertion will fail. You can view the tests as ways to 'tease' the assert statements by executing them over and over again with different states. The asserts will catch the vast majority of the aforementioned structural problems, and manually verifying the results of the system tests will catch semantic problems (i.e. is the program doing the right thing for this particular input?) that cannot easily be checked by assert statements.
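As an illustration of asserting representation invariants and pre-conditions, here is a small sketch built around a fixed-capacity integer stack. The type IntStack and all of the function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16

typedef struct {
    int items[STACK_CAPACITY];
    size_t count;               /* number of valid entries in items */
} IntStack;

/* Representation invariant: count never exceeds the capacity. */
static void stack_check_rep(const IntStack *s)
{
    assert(s != NULL);
    assert(s->count <= STACK_CAPACITY);
}

void stack_init(IntStack *s)
{
    assert(s != NULL);
    s->count = 0;
    stack_check_rep(s);
}

void stack_push(IntStack *s, int value)
{
    stack_check_rep(s);
    assert(s->count < STACK_CAPACITY);  /* pre-condition: not full */
    s->items[s->count++] = value;
    stack_check_rep(s);
}

int stack_pop(IntStack *s)
{
    stack_check_rep(s);
    assert(s->count > 0);               /* pre-condition: not empty */
    int value = s->items[--s->count];
    stack_check_rep(s);
    return value;
}
```

Every public function checks the invariant on entry and on exit, so any system test that happens to drive the stack into a bad state stops at the offending call rather than producing a subtly wrong trace much later. Compiling with NDEBUG defined removes the checks for release builds.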
Turn your compiler warning level up as high as you feel comfortable with (if you use GCC, see the various -W options in the manual, most notably -Wall for turning on most warnings), and then look at all the warnings and try to address them as though they were errors. The C standard doesn't provide very tight constraints for what code can legally compile, so you can get away with running fairly atrocious code. For example, I once forgot to include a return statement in a function that returns an integer, so the function returned whatever junk value was in the EAX register. This caused a subtle bug that took me a long time to detect. Even if a compiler warning seems senseless, look at that line of code because it might be an indication of some other error elsewhere. Because you are programming in C, you already lose lots of the compile-time safety checks available in higher-level languages, so you must be vigilant to ensure that you don't allow code that has blatant syntactic bugs to compile and run.
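The missing-return mistake described above looks roughly like the commented-out function in the sketch below. The function clamp_to_zero is a hypothetical stand-in, not the original code. With GCC, -Wall enables -Wreturn-type, which reports "control reaches end of non-void function" for the buggy shape, and -Werror=return-type turns that particular warning into a hard error.

```c
/* Possible build line (GCC):
 *   gcc -Wall -Wextra -Werror=return-type example.c
 *
 * Buggy shape (do not use): when x is negative, control falls off the
 * end of a non-void function and the caller reads a junk value.
 *
 *   int clamp_to_zero(int x)
 *   {
 *       if (x >= 0)
 *           return x;
 *   }
 *
 * Fixed version: every path through the function returns a value. */
int clamp_to_zero(int x)
{
    if (x >= 0) {
        return x;
    }
    return 0;
}
```

The buggy version compiles without complaint at the default warning level, which is exactly why the warning is worth treating as an error.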
Top ADO Tips
Connecting to data stores
How do I connect to a MS Access 2000 database?
ADOConnection.ConnectionString := 'Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyDatabase.mdb;Persist Security Info=False';
How do I connect to a password protected MS Access 2000 database?
ADOConnection.ConnectionString := 'Provider=Microsoft.Jet.OLEDB.4.0;Jet OLEDB:Database Password=XXXXXX;Data Source=C:\MyDatabase.mdb;Persist Security Info=False';
What provider should I use for MS Access?
For MS Access 97 use Microsoft.Jet.OLEDB.3.51
For MS Access 2000 use Microsoft.Jet.OLEDB.4.0
How do I connect to a dBase database?
For dBase, Data Source is the folder that contains the .dbf files, not a single file:
ADOConnection.ConnectionString := 'Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyDatabaseFolder;Extended Properties="dBase 5.0;"';
How do I connect to a Paradox database?
As with dBase, Data Source is the folder that contains the Paradox tables:
ADOConnection.ConnectionString := 'Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyDatabaseFolder;Extended Properties="Paradox 7.X;"';
How do I connect to a MS Access database on a CD (read only) drive?
ADOConnection.Mode := cmShareExclusive;
Data retrieving and manipulation
How do I use multiword table / field names (spaces in Table or Field name)?
Enclose multiword names in [ ] brackets:
ADOQuery1.SQL.Text := 'SELECT [Last Name], [First Name] FROM [Address Book]';
How do I use constant fields in an SQL query?
ADOQuery1.SQL.Text := 'SELECT ''2002'', [First Name], Salary FROM Employees';
How do I delete all records in a table?
ADOQuery1.SQL.Text := 'DELETE * FROM TableName';
Why do I keep getting "-1" for the RecordCount property?
If you need the RecordCount to be correct, set the CursorType to something other than ctOpenForwardOnly.
I'm using an AutoNumber field as the primary key to make every record unique. If I try to read or edit an ADOTable record after one has been appended (and posted), I get an error: "The specified row could not be located for updating. Some values may have been changed since it was last read". Why?
After every new record you should use:
var
  bok: TBookmarkStr;
begin
  bok := ADOTable1.Bookmark;
  ADOTable1.Requery();
  ADOTable1.Bookmark := bok;
end;
How do I create a disconnected ADO recordset? I want to run a query, pick the data and delete some records, but not physically.
To create a disconnected ADO recordset, first set the ADODataSet's CursorLocation property to clUseClient. Then open the recordset. Then set the ADODataSet's Connection to nil. Do not close the ADODataSet.
How do I retrieve system information, for example a list of tables, fields (columns) or indexes, from the database?
The TADOConnection object has an OpenSchema method that retrieves system information such as the list of tables, the list of columns, the list of data types and so on. The following example shows how to fill an ADODataSet (DS) with a list of all indexes on a table (TableName):
var DS:TADODataSet;
...
ADOConnection.OpenSchema(siIndexes, VarArrayOf([Unassigned, Unassigned, Unassigned, Unassigned, TableName]), EmptyParam, DS);
How can I improve the performance of my ADO application (for example, speed up query data retrieval)?
- Avoid returning too many fields. ADO performance suffers as the number of returned fields grows. For example, don't use "SELECT * FROM TableName" when TableName has 40 fields and you really need only 2 or 3 of them.
- Choose your cursor location, cursor type, and lock type with care. There is no single cursor type you should always use. Your choice depends on the functionality you want, such as updatability, cursor membership, visibility and scrollability. Opening a keyset cursor may take time to build the key information if the table has a lot of rows, whereas opening a dynamic cursor is much faster.
- Release your dynamically created ADO objects as soon as you are done with them.
- Check your SQL expression: when joining tables with WHERE t1.f1 = t2.f1 AND t1.f2 = t2.f2, it is important that the fields f1 and f2 are indexed.