# Tips to make your program more efficient

Discussion in 'C' started by asadullah.ansari, Jan 9, 2008.

Joined:
Jan 9, 2008
Messages:
356
14
Trophy Points:
0
Occupation:
Developer
Location:
NOIDA
How to make your program to be fast.

1. Use Pre Increment or Pre Decrement in place of Post Increment or Post Decrement operator.

i.e. use ++i or --i in place of i++ or i--.

Because ++i operator take increment and save it i.e. One CPU Instruction is required for it. and in case of i++ first save the value of i in register and then Increment, i.e. It takes two instruction set.

So My dear just imagine If you have a loop of 10000 . Then you can save 10000 CPU Instruction to be waste.

PRACTICE: Make Practice to write simple Loop as
Code:
`for( i=0; i<n; ++i)`
in place of
Code:
`for( i=0;i<n;  i++)`
2. Use either for( i=n; i>=0;--i) or for( i=0;i<=n;++i)

Depends on CPU which are you using. Just check CPU Ticks in both cases which for loop is taking more time. mostly CPU give more efficiently on for( i=n; i>=0;--i)

3. Never use function call inside for loop. Make it redesign to be a for loop inside one function.

Example: In place of
for( i=n;i<=0;--i)

use
Code:
```Fun4Add(int num)
{
sum+=num;
}```
Because No Overhead of creating stack frame, saving register etc in every loop and don't declare a variable in a loop and avoid to put any expression in a loop If that expression can be without loop with same output.

4. Don't Use Math library function pow(num1,num2) if num1 is multiple of 2. Because POW() library function is very costly. You can imagine about this arithmetic operation.

If num1==2 then use " Left shift Operator" num1< num2 . It is 100 times fast that Pow() library Function.

5. Always Declare any multi dimensions Array with the power Of 2.

example
int arr[2][1024] in place of int arr[1][1023] etc
int arr[32][128] in place of int arr[30][128] etc.

Because Reason is same as i have given in 4.

Joined:
Jul 12, 2004
Messages:
15,292
365
Trophy Points:
83
I don't agree on your first point
Code:
```int main(int argc, char* argv[])
{
time_t t1,t2;
long n = 1000000000;
int i = 0;
(void) time(&t1);
for( i = 0; i < n;  ++i);
(void) time(&t2);
printf("time for for( i = 0; i < n;  ++i); = %f\n", (double) ((double)t2-(double)t1));
(void) time(&t1);
for( i = 0; i < n;  i++);
(void) time(&t2);
printf("time for for( i = 0; i < n;  i++); = %f\n", (double) ((double)t2-(double)t1));
return 0;
}```
Does not give consistent output. Try running the above program multiple times and you will see that it toggles between which time is more.

With Point 3 it always the necessity to be doing so.

I guess if you are writing a thousand lines of codes then it hardly matter couple of CPU cycles and practically you don't get a loop for even greater than 1000 or else you should design your code a bit better.

Joined:
Apr 4, 2005
Messages:
1,646
86
Trophy Points:
0
Occupation:
Programmer
Location:
Kolkata, India
Why does the time taken toggles???

4. ### SalemNew Member

Joined:
Nov 15, 2007
Messages:
133
5
Trophy Points:
0
Occupation:
Please don't PM me for 1:1 support.
> i.e. use ++i or --i in place of i++ or i--.
In the context of a for loop, where you're just using the operator for it's side effect, I've never seen it make a bean of difference whether you write
i++
++i
i += 1
i = i + 1
Any decent compiler will easily recognise all those things as meaning the same thing, and generate exactly the same code.
Study the output of "gcc -S prog.c" (result in prog.s) to view the generated assembler.

You're dealing with a compiler, not a programmable calculator. It doesn't perform a literal translation of each character you type into a machine instruction. If the old result of say i++ is never referenced, then it simply won't exist in the executable program.

> 5. Always Declare any multi dimensions Array with the power Of 2.
Why, what about all that wasted space?

> Because Reason is same as i have given in 4.
Except array subscripting is based on multiplication, not raising 2 to the power of something.

Modern compilers like gcc will optimise a great many more numbers as efficient shift/add combinations than the obvious powers of 2 which you seem to like.

This all smacks of premature optimisation disease.
You've got to make it "right" before you can make it "fast". A bug-ridden program which is delivered late is no use to anyone, no matter how fast it is.

Besides which, you need a working program to test with meaningful data if you're to get anything useful at all out of profiling. Profiling will tell you where the real hot spots are, not your guesses based on some hearsay evidence from some web page. Many (if not all) of the old tricks simply don't apply to today's optimising compilers.

Try this if you want to find out more.
http://leto.net/docs/C-optimization.php

shabbir likes this.
5. ### oogaboogaNew Member

Joined:
Jan 9, 2008
Messages:
115
11
Trophy Points:
0
All compilers do simple optimizations that should make such
purely syntactic differences meaningless at any rate.
However, there is in general no performance difference between
pre and post inc or dec. Using ++i or i++, where it otherwise
doesn't matter (in the greater expression, if any), is purely philosophical.
Consider the following disassembly.
Code:
```int main() {
int i, j, n = 0;
for (i = 0; i < 10; i++) n++;
for (j = 0; j < 10; ++j) ++n;
}

_main:
...
movl	\$0, -12(%ebp)    ## n = 0
movl	\$0, -4(%ebp)     ## i = 0
L2:
cmpl	\$9, -4(%ebp)     ## compare i to 9
jle	L5                   ## goto L5 if <=
jmp	L3                   ## else goto L3
L5:
leal	-12(%ebp), %eax  ## load n
incl	(%eax)           ## inc n
leal	-4(%ebp), %eax   ## load i
incl	(%eax)           ## inc i
jmp	L2
L3:
movl	\$0, -8(%ebp)     ## j = 0
L6:
cmpl	\$9, -8(%ebp)     ## compare j to 9
jle	L9                   ## goto L9 if <=
jmp	L7                   ## else goto L7
L9:
leal	-12(%ebp), %eax  ## load n
incl	(%eax)           ## inc n
leal	-8(%ebp), %eax   ## load j
incl	(%eax)           ## inc j
jmp	L6
L7:
leave
ret
```

shabbir likes this.

Joined:
Jan 9, 2008
Messages:
356
14
Trophy Points:
0
Occupation:
Developer
Location:
NOIDA
Hi oogabooga!!! Just think about (++User data) and (user data ++) in place of Inbuilt Data type.
and just check correct code of assembly generated by Compiler.

First thing dont leave anything that Compiler will do. But why? Better we made it as habit. Now what is happening in Microsoft Technology, No need to learn about any things. Just How to use and use it. Microsoft philosphy is that " Making less Man Power In industry, Dont spend time to learn"

7. ### oogaboogaNew Member

Joined:
Jan 9, 2008
Messages:
115
11
Trophy Points:
0
That is a good point, asadullah.ansari. But your original post seemed to be about built-in types. The "reason" given seemed to be assembly code differences, not actual user code differences.

But of course in a user type, a post-inc is defined something like:
Code:
```User User::operator++( int ) { /* Variable name not needed!! */
User tmp = *this;
member++;     /* Whatever increment semantics you need */
return tmp;
}```
In that case, a post-inc is potentially ridiculously slower. But if it is your own class, then you will be aware of that. If it is a library object, however, you should be aware of the potential cost, which, if significant, should be in the class documentation.

As for the original post generally, I agree with Salem that most of this is best left to the compiler until it is proved necessary. If your loop calls a function, so what? How much time is really wasted?
It could be anywhere from unnoticable to intolerable since the more work the function actually does, the less significant the overhead as a fraction of the total cost.

Compilers are not (entirely) stupid. They know that a function call has overhead and balance that against the length of the function and potentially inline it.

In point 4 I assume you meant "power", not "multiple", of 2. I can't imagine anyone using pow to calculate powers of 2, but clearly bitshifting is much faster.

Point 5 is hideously incorrect! The most important aspect of program speed is size (of program and especially data). If more can fit into the CPU cache, it will run faster. Besides, array offsets are calculated using multiplication, not powers.

Last edited: Jan 21, 2008

Joined:
Jan 9, 2008
Messages:
356
14
Trophy Points:
0
Occupation:
Developer
Location:
NOIDA
>> It should be habit. By any way it's good Only. You know a lot scientific & rresearch product based on computing. There mili seconds also matter.

>>> In real time i fix a bug in a product " Statistical based product" related to that.
You can'nt imagine, If That functions have a lot of arguments. If this function is called inside a loop suppose million times. Then you can see differences. In Statistical data analysis product Millions times loop is normal.

Joined:
Jan 9, 2008
Messages:
356
14
Trophy Points:
0
Occupation:
Developer
Location:
NOIDA
>> It's not like that. People forgot to use. Because many calculations based on power on doubles, there using pow function If in these between power of 2 is required, then normally people forgot to use bit shift.
So just it's care .>>>>

>>>Offcourse man If you will fijt more data at a time into catch it will be fast. But why you are raising this point here. No logic to raise this idea here.
You point is in my post: Fast Algorithm for matrix multiplication: Block Algorithm

10. ### sandymalhotraNew Member

Joined:
Feb 6, 2008
Messages:
9
1
Trophy Points:
0
For Me Robustness is more important than efficient!!!

11. ### imrantechiNew Member

Joined:
Feb 12, 2008
Messages:
116
4
Trophy Points:
0
i feel your 4th point really helps a lot

12. ### crazytolearn57New Member

Joined:
Feb 14, 2008
Messages:
48
0
Trophy Points:
0
some are worth

13. ### aisha.ansari84New Member

Joined:
Feb 13, 2008
Messages:
82
1
Trophy Points:
0

14. ### alrameshNew Member

Joined:
Feb 5, 2008
Messages:
22
0
Trophy Points:
0
For me Processor is too much low cosly so i did'nt think about speed only robustnes..

Joined:
Feb 13, 2008
Messages:
103