SOME WAYS TO COME UP WITH LESS PERFORMANT ALGORITHMS

Incites other scripters to share some causes behind some less performant algorithms(Mainly about time and memory usage)

  • DoubleX
  • 03/08/2015 02:33 AM
  • 1190 views
Disclaimer:
Performance isn't everything, it isn't even always the most important thing. All the other programming aspects should be taken care of as well. Scripters should have a clear understanding of the importance and priority of each programming aspect in each part of their codes to ensure a working balance. They should also take their own current scripting proficiency into account to ensure none of their codes will ever be out of their control.

Let's start by showing how I came up with less performant algorithms lol

Example 1
Suppose I've to check if a battler's any new or removed states and the checking can be needed at any frame. The result is used by at most 1 method for at most once per frame and the usage timing can be different from that of changing states.

1st algorithm - I write a battler method being called per frame, which checks if the current state array is the same as that in the last frame and returns the result:
def battler_state_change?
  return false if @states == @last_states
  @last_states = @states.clone
  true
end

Performance - The time complexity of this algorithm's at least O(n) and the memory has to read and probably write(when the state array's changed) an array.
It seems acceptable for me at first, but after a while I reviewed this method and asked "Can it be better?". Fortunately, I came up with a more performant algorithm shortly afterwards.

2nd algorithm - I first use a variable to mark that a battler's added or removed a state, then write a method to reset that variable and returns true if that variable's true:
alias clear_states_example clear_states
def clear_states
  clear_states_example
  @state_change = true
end

alias add_new_state_example add_new_state
def add_new_state(state_id)
  add_new_state_example(state_id)
  @state_change = true
end

alias erase_state_example erase_state
def erase_state(state_id)
  erase_state_example(state_id)
  @state_change = true
end

def battler_state_change?
  return false unless @state_change
  @state_change = false
  true
end

Performance - The time complexity of this algorithm's O(1) and the memory has to read and probably write(when the state array's changed) a boolean value.

Performance comparison - The 2nd algorithm is clearly more performant than the 1st one, and the difference of their results can become non-trivial when there are lots of battlers each having lots of states. The fact that both can be run every frame significantly magnifies the difference, which may cause a less powerful machine to use a constant average fps drop to complain XD
Although the 2nd algorithm can be more prone to compatibility issues(like when some scripts that have to be placed below my script have rewritten clear_states, add_new_state or erase_state), they shouldn't be too hard to solve in most cases.
You may argue that the increased number of method calls might outweight what the 2nd algorithm gains, but it'd be true only if states are added and/or removed every several frames, which is extremely unlikely.

How did I come up with the 1st algorithm - It's done by directly translating the requirements into algorithms without even analyzing those requirements beforehand. That direct translation causes me to think that it's necessary to store the state array in the last frame and compare that with that in the current frame as the latter can be changed in any frame.
After analyzing those requirements for a while, however, I realized that if I can catch all the codes that changes the state array, I can just use an instance variable to mark that the state array's changed. It just have to be reset to its default value right after notifying the change to ensure correct notification.

On a side note, if a state's added and removed again before calling battler_state_change? again, algorithm 1 will say the state array didn't change but algorithm 2 will say the state array changed(I don't know if there are some other cases which can also cause this). If this case's non-trivial, then pick the one giving the desired result, as correctness almost always comes before performance, unless all the correct choices have intolerable performance issues(and in this case they don't in general).

Example 2(Slightly modified version of the YSA-CATB Percentage Addon)
Suppose I've to update the position of all the 5 atb bars(back, atb, charge, cooldown, percentage) of each enemy per frame to ensure they're all always placed directly below each enemy's sprite.

1st algorithm - I use the enemy sprite x and y positions to set all that enemy's atb bars directly below that enemy's sprite, and ensure they won't overlap with the status window(mostly the original Yami's algorithm with quite some minor twists, added support to that script and some compatibility issues):
(Both @type and @start_width are fixed once set; update_position is the only method changing the atb bar positions)
#----------------------------------------------------------------------------|
#  Rewrite method: update_position                                           |
#----------------------------------------------------------------------------|
def update_position
  dx = @battler.screen_x - @start_width / 2
  dy = @battler.screen_y
  # Added to set the x and y offset of percentage text
  if @type == :percent
    dx += TEXT_X_OFFSET
    dy += TEXT_Y_OFFSET
  end
  #
  rect.x = dx
  dh = rect.height + 1
  # Rewritten to ensure the percent bar won't overlap with the status window
  dh += 2 unless @type == :back || @type == :percent
  dy = [dy, Graphics.height - dh - 120].min
  dy += 1 unless @type == :back || @type == :percent
  #
  # Added to be compatible with DoubleX RMVXA Enemy MP/TP Bars Addon to YEA-BattleEngine and YEA-EnemyHPBars
  dy -= ENEMY_MP_GAUGE_HEIGHT + ENEMY_TP_GAUGE_HEIGHT if TP_MP_ADDON
  dy -= YEA::BATTLE::ENEMY_GAUGE_HEIGHT if $imported["YEA-EnemyHPBars"]
  #
  rect.y = dy
end # update_position

Performance - Right now it's just too complicated for me to analyze, but my data suggests that the above code's quite expensive.
It' still great though as it does almost nothing redundant and the codes are nearly optimized(as it's mostly the Yami's original algorithm with some minor twists to boost the performance a bit further :)).

2nd algorithm - I first check if the enemy sprite x and y positions are changed, and use the 1st algorithm only if at least 1 of them changed:
(Both @type and @start_width are fixed once set; update_position is the only method changing the atb bar positions)
#----------------------------------------------------------------------------|
#  Rewrite method: update_position                                           |
#----------------------------------------------------------------------------|
def update_position
  dx = @battler.screen_x
  dy = @battler.screen_y
  # Added to update the atb positions only if the battler's position's changed
  return if @last_x == dx && @last_y == dy
  @last_x = dx
  @last_y = dy
  dx -= @start_width / 2
  #
  # Added to set the x and y offset of percentage text
  if @type == :percent
    dx += TEXT_X_OFFSET
    dy += TEXT_Y_OFFSET
  end
  #
  rect.x = dx
  dh = rect.height + 1
  # Rewritten to ensure the percent bar won't overlap with the status window
  dh += 2 unless @type == :back || @type == :percent
  dy = [dy, Graphics.height - dh - 120].min
  dy += 1 unless @type == :back || @type == :percent
  #
  # Added to be compatible with DoubleX RMVXA Enemy MP/TP Bars Addon to YEA-BattleEngine and YEA-EnemyHPBars
  dy -= ENEMY_MP_GAUGE_HEIGHT + ENEMY_TP_GAUGE_HEIGHT if TP_MP_ADDON
  dy -= YEA::BATTLE::ENEMY_GAUGE_HEIGHT if $imported["YEA-EnemyHPBars"]
  #
  rect.y = dy
end # update_position

Performance - It's O(1) and several added memory access to some integer values + that of the 1st algorithm when the enemy sprite position changes, otherwise it's just O(1) and several added memory access to some integer values, unless the screen_x or screen_y readers are rewritten to some insane stuffs.

Performance Comparison - In most cases, it'll perform significantly better than the 1st algorithm, as normally it's quite unlikely that an enemy sprite x and y positions will change frequently(let alone per frame), and the position checking's cheap while the 1st algorithm's expensive. The average fps boosted by at least 2 by switching from the 1st algorithm to the 2nd algorithm in my machine, which can be noticeable.

How did I come up with the 1st algorithm - It's done by directly implementing my percentage addon to YSA-CATB and solving compatibility issues without asking "Can really great codes be even better?". This causes me to merely did some minor optimizations instead of rethinking the Yami's algorithm.
As my percentage addon sucks hard(it still is) and causes a 30+ average fps drop(from roughly 119 to roughly 88) in my machine, I had to optimize every last bit of the addon. Then I realized that all the atb bar positions need to be changed only if the enemy sprite's position's changed, as the 1st algorithm won't change the formers at all unless the latter's changed. So I decided to call the 1st algorithm only if the latter's changed to reduce the amount of redundant code executions.

Example 3(Slightly modified version of (DoubleX)Intercept Magic)
Suppose I've to read some notetags used by that script from the database.

1st algorithm - I check each line against all notetags to be read:
#----------------------------------------------------------------------------|
#  New method: load_notetags_intercept_magic                                 |
#----------------------------------------------------------------------------|
def load_notetags_intercept_magic
  @intercept_tier = 0
  @intercept_chance =100
  @battler_intercept = {}
  @intercept_mp = @intercept_tp = @hit_intercept_mp = @hit_intercept_tp = false
  @note.split(/[\r\n]+/).each { |line|
    if line =~ /<(?:INTERCEPT_TIER|intercept tier):[ ]*(\d+)>/i
      @intercept_tier = $1.to_i if $1.to_i > @intercept_tier
    end
    if line =~ /<(?:INTERCEPT_MP|intercept mp)>/i
      @intercept_mp ||= @intercept_tier > 0
    end
    if line =~ /<(?:INTERCEPT_TP|intercept tp)>/i
      @intercept_tp ||= @intercept_tier > 0
    end
    if line =~ /<(?:HIT_INTERCEPT_MP|hit intercept mp)>/i
      @hit_intercept_mp ||= DoubleX_RMVXA::Intercept_Magic::Intercept_MP_Limit && @intercept_mp
    end
    if line =~ /<(?:HIT_INTERCEPT_TP|hit intercept tp)>/i
      @hit_intercept_tp ||= DoubleX_RMVXA::Intercept_Magic::Intercept_TP_Limit && @intercept_tp
    end
    if line =~ /<(?:ALLY_INTERCEPT|ally intercept)>/i
      @battler_intercept[:ally] = true
    end
    if line =~ /<(?:ENEMY_INTERCEPT|enemy intercept)>/i
      @battler_intercept[:enemy] = true
    end
    if line =~ /<(?:INTERCEPT_CHANCE|intercept chance):[ ]*(\d+)>/i
      @intercept_chance = $1.to_i / 100.0 if $1.to_i < @intercept_chance
    end
  }
end # load_notetags_intercept_magic

Performance - I don't know the performance of reading regular expressions(O(n)? O(n^2)? Just my wild guesses for its time complexity), but it's also determined by the number of lines of the item and that of the notetags to be read. The more notetags to be read, the more time the above algorithm needs. The memory usage depends on what and how many notetags are read, and right now it's too complex for me to analyze.
I was satisfied. It works and theoretically it does what it just need to do.

2nd algorithm - Similar to the 1st one, but once a notetag's read the method jumps right to the next line. Also, I check the most frequent notetag first:
#----------------------------------------------------------------------------|
#  New method: load_notetags_intercept_magic                                 |
#----------------------------------------------------------------------------|
def load_notetags_intercept_magic
  @intercept_tier = 0
  @intercept_chance =100
  @battler_intercept = {}
  @intercept_mp = @intercept_tp = @hit_intercept_mp = @hit_intercept_tp = false
  @note.split(/[\r\n]+/).each { |line|
    case line
    when /<(?:INTERCEPT_TIER|intercept tier):[ ]*(\d+)>/i
      @intercept_tier = $1.to_i if $1.to_i > @intercept_tier
    when /<(?:INTERCEPT_MP|intercept mp)>/i
      @intercept_mp ||= @intercept_tier > 0
    when /<(?:INTERCEPT_TP|intercept tp)>/i
      @intercept_tp ||= @intercept_tier > 0
    when /<(?:HIT_INTERCEPT_MP|hit intercept mp)>/i
      @hit_intercept_mp ||= DoubleX_RMVXA::Intercept_Magic::Intercept_MP_Limit && @intercept_mp
    when /<(?:HIT_INTERCEPT_TP|hit intercept tp)>/i
      @hit_intercept_tp ||= DoubleX_RMVXA::Intercept_Magic::Intercept_TP_Limit && @intercept_tp
    when /<(?:ALLY_INTERCEPT|ally intercept)>/i
      @battler_intercept[:ally] = true
    when /<(?:ENEMY_INTERCEPT|enemy intercept)>/i
      @battler_intercept[:enemy] = true
    when /<(?:INTERCEPT_CHANCE|intercept chance):[ ]*(\d+)>/i
      @intercept_chance = $1.to_i / 100.0 if $1.to_i < @intercept_chance
    end
  }
end # load_notetags_intercept_magic

Performance - Similar to the 1st one, but now it also depends on how many notetags are checked before a notetag's read.

Performance Comparison - Unless there's no notetag at all or the only notetag being read is the least frequent one, the 2nd algorithm will perform at least slightly better than the 1st one, as now not every line needs to check against all notetags, they just have to check until a notetag's read. The performance difference can be non-trivial when there are lots of notetags to be read.

How did I come up with the 1st algorithm - It's done by not thinking the user side of the story thoroughly enough. Theoretically, I've to check every line against every notetag as I've no way to ensure all users of my script will always put at most 1 notetag per line. But in practice, almost all users will just put at most 1 notetag per line unless the scripts ask otherwise. If you still feel unsafe, you can even ask users to put at most 1 notetag per line and nearly all users should have no problem with it. So when a notetag's read, I'm almost certain that there will be no more notetags in the same line, so continue checking against the rest is just redundant. It's better to jump to the next line instead. Checking the most frequent notetag first gives that script the highest chance to check against the least notetags, thus further boosting the performance.

There may be(and extremely likely are) many, many other specific reasons and/or general practices behind less performant algorithms. Any specific example or general thought about this to share here?

P.S.: I won't claim any of my 2nd algorithm is the best or even ideal, as I just don't know if there are even better ones. I'd be glad if you'd tell me some better algorithms(if any) for the above examples.