[m-users.] Question about any difference in efficiency on this code...

Sean Charles (emacstheviking) objitsu at gmail.com
Sat Aug 26 20:48:41 AEST 2023


That's pretty much what I thought the answer might be, Zoltan, so the rule against premature optimisation wins again!

As you rightly say, if one suspects a performance issue, then some profiling is the way forward, but until that time... happy to keep bashing out the Mercury code!

Thanks,
Sean


> On 26 Aug 2023, at 11:45, Zoltan Somogyi <zoltan.somogyi at runbox.com> wrote:
> 
> 
> On 2023-08-26 20:12 +10:00 AEST, "Sean Charles (emacstheviking)" <objitsu at gmail.com> wrote:
>> I started out with this:
>> 
>>    get_random_value(0, 2, V, !IO),
>>    ( if V = 0 then
>>        Speed = 0.25, Color = color(gray)
>>    else if V = 1 then
>>        Speed = 0.75, Color = color(skyblue)
>>    else
>>        Speed = 1.25, Color = color(beige)
>>    ),
>>    Star = star(X, StarY, Speed, to_rgba(Color)).
>> 
>> and then, for some half-baked reason regarding not creating Speed and Color but instead directly returning Star...
>> 
>>    get_random_value(0, 2, V, !IO),
>>    ( if V = 0 then
>>        Star = star(X, StarY, 0.25, to_rgba(color(gray)))
>>    else if V = 1 then
>>        Star = star(X, StarY, 0.75, to_rgba(color(skyblue)))
>>    else
>>        Star = star(X, StarY, 1.25, to_rgba(color(beige)))
>>    ).
>> 
>> 
>> So, is there any real difference, or did I do something good / bad / indifferent at best?
> 
> There may be a difference in the performance of those two pieces of code,
> but any effect will be quite small; I would be surprised if it were more than
> half a percent. On my laptop, I cannot reliably measure differences that small,
> because the hardware's mechanisms for raising and lowering the CPU frequency
> (to keep the CPU's heat dissipation within the required limits) have a bigger
> and effectively random effect.
> 
> That difference is not worth worrying about in a user program unless profiling
> indicates the predicate to be a bottleneck. It can be worth worrying about
> in a compiler, because once the transformation from the usually-slightly-slower form
> to the usually-slightly-faster one is implemented (and it isn't hard), there is
> no point in not invoking it. In fact, the Mercury compiler does have such a transformation,
> which would be invoked for the top code if both star and to_rgba are function symbols,
> as opposed to executable functions. (The comment at the top of compiler/follow_code.m
> explains its rationale.)
> 
> Note that I say *usually* faster or slower. This is because this transformation changes
> the size of the code, in that the second form above replaces one copy of the code
> that computes Star with three copies. By changing which parts of the program
> collide in the instruction cache with which other parts, this can change the effectiveness
> of the instruction cache. The direction and size of this effect cannot be predicted
> by any reasonable algorithm in the Mercury compiler, because (a) Mercury generates
> not machine code but e.g. C code, so only the target language compiler (such as gcc)
> knows the sizes of the instructions it selects, and (b) even for a single target ABI, the
> CPUs implementing that ABI may, and almost always will, differ in the size and other
> characteristics of the cache. This usually matters only if the cache is direct mapped
> (which is rare these days) *and* the predicate is part of a performance bottleneck.
> 
> All of which means that there is no way to be *sure* which of the above versions
> is faster on a given machine, other than executing and timing both versions.
> I wouldn't worry about it; write whichever version you like.
> 
> Zoltan.
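
For anyone who does want to follow the "execute and time both versions" suggestion, here is a minimal sketch of one way it could be done, using benchmark_det from the standard library's benchmarking module. It is not the code from the thread: the star type, the plain int standing in for to_rgba(color(...)), the predicate names, the input values and the repeat count are all hypothetical, chosen only to keep the module self-contained. As noted above, differences this small may well be lost in measurement noise.

```mercury
:- module star_timing.
:- interface.
:- import_module io.

    % benchmark_det is cc_multi, so main is declared cc_multi as well.
:- pred main(io::di, io::uo) is cc_multi.

:- implementation.
:- import_module benchmarking.
:- import_module list.
:- import_module string.

    % Hypothetical stand-in for the thread's star/4 term. The colour is
    % simplified to an int so that no graphics library is needed.
:- type star
    --->    star(float, float, float, int).

    % First form: pick Speed and Color in the branches, build Star once.
:- pred shared_construction({float, float, int}::in, star::out) is det.

shared_construction({X, StarY, V}, Star) :-
    ( if V = 0 then
        Speed = 0.25, Color = 1
    else if V = 1 then
        Speed = 0.75, Color = 2
    else
        Speed = 1.25, Color = 3
    ),
    Star = star(X, StarY, Speed, Color).

    % Second form: build Star separately in each branch.
:- pred per_branch_construction({float, float, int}::in, star::out) is det.

per_branch_construction({X, StarY, V}, Star) :-
    ( if V = 0 then
        Star = star(X, StarY, 0.25, 1)
    else if V = 1 then
        Star = star(X, StarY, 0.75, 2)
    else
        Star = star(X, StarY, 1.25, 3)
    ).

main(!IO) :-
    % Assumed values; raise Repeats until the times are well above
    % the resolution of the timer.
    Repeats = 10000000,
    Input = {10.0, 20.0, 1},
    benchmark_det(shared_construction, Input, StarA, Repeats, TimeA),
    benchmark_det(per_branch_construction, Input, StarB, Repeats, TimeB),
    io.format("shared: %d ms, per-branch: %d ms\n",
        [i(TimeA), i(TimeB)], !IO),
    % Sanity check that the two forms compute the same term.
    ( if StarA = StarB then
        io.write_string("both versions agree\n", !IO)
    else
        io.write_string("versions disagree\n", !IO)
    ).
```

Running it several times and comparing the spread between runs with the gap between the two forms gives a rough idea of whether any difference is real. Since the compiler may transform the first form into the second anyway, identical timings would not be surprising.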
