新手问题 和为定值的随机数

bluexuemei · 2015年01月03日 · 最后由 swordray 回复于 2015年01月12日 · 4650 次阅读

比方说 sum=80,我要生成 10 个在 4..13 之间的随机数,求高效算法。(可重复) 注:是 10 个随机数的和为 80 my code:

nums = 10.times.collect {rand(4..13)} until 80== (nums || []).inject(:+)

但是如果随机数的个数很大的话,效率非常低

1.9.3-p545 :010 > sum=80
 => 80 
1.9.3-p545 :011 > 10.times{a = rand(0..sum);b = sum - a;puts "#{a} + #{b}  = #{sum}"}
38 + 42  = 80
55 + 25  = 80
14 + 66  = 80
2 + 78  = 80
69 + 11  = 80
0 + 80  = 80
66 + 14  = 80
74 + 6  = 80
39 + 41  = 80
57 + 23  = 80
 => 10 

这有啥算法?

#2 楼 @badboy ,你误解了我的意思,请再看题目

哦,抱歉,你最开始的描述让我误解了。。。

10 个 4..13 内的随机数并且要求和为 80,那么这 10 个数一定不是等概率分布的。 所以简单来说:

  1. 计算「10 个 4..13 内的随机数并且要求和为 80」情况下每个数的概率分布
  2. 按照这种分布进行随机,随机到第 n 个数时判断前 n 个数加上剩下 (10-n) 个数是否可能和为 80,如果不可能就重新生成第 n 个数。
count = 10
sum = 80
rand_data = 4..13
data_max = (4..13).max
data_min = (4..13).min
data = []
count.times do |i|
  if i < count - 1
    while true do
      ok_data = rand(rand_data)
      use_data = data.inject(:+).to_i
      current_data = sum - use_data
      if((count - i) * data_max >= current_data - ok_data) && (((count - i) * data_min + ok_data ) < current_data) && (sum - use_data - ok_data) < data_max * (count - i - 1)
        data << ok_data
        break
      end
    end
  else
     data << sum - data.inject(:+).to_i
  end

end

puts data.inspect
puts data.inject(:+).to_i
guotekiMacBook-Pro:applnc guo$ ruby number.rb 
[11, 12, 10, 4, 13, 9, 4, 4, 4, 9]
80
guotekiMacBook-Pro:applnc guo$ ruby number.rb 
[12, 10, 12, 13, 7, 4, 4, 4, 5, 9]
80
guotekiMacBook-Pro:applnc guo$ ruby number.rb 
[7, 8, 10, 4, 10, 13, 5, 4, 7, 12]
80
guotekiMacBook-Pro:applnc guo$ ruby number.rb 
[12, 6, 10, 5, 8, 12, 7, 4, 6, 10]
80

帮你堆出来了。。。我不会算法的飘过。。。

Dynamic Programming

def nums n, sum, lb, ub
  if n * ub < sum or n * lb > sum
    return
  end
  if n == 1
    return [sum]
  end
  loop do
    x = rand lb..ub
    res = nums n - 1, sum - x, lb, ub
    if res
      return res << x
    end
  end
end

nums 10, 80, 4, 13

#7 楼 @luikore ,perfect! thank you!

@luikore 我一开始的思路和上面大多数人都差不多,就是每次找出随机产生一个数。但是这样的分布不是很随机,开头的一些数的平均值会大约等于 min,max 的中间值,然后剩下的全部都是 min 或 max。比如可以带入puts nums(100, 10000, 10, 110).sort.inspect试试看

a = []; 10.times { |i| a << rand([4, 80 - a.sum - (10 - 1 - i) * 13].max..[13, 80 - a.sum - (10 - 1 - i) * 4].min) }

注意一下随机数不要超过范围就行了

#9 楼 @dongqs 是的可以调整去掉一部分, 例如这样:

def nums n, sum, lb, ub
  if n * ub < sum or n * lb > sum
    return
  end

  if n == 1
    return [sum]
  end

  # edge sum, we have (1/2)**(n-1) chance to continue
  if n * ub == sum or n * lb == sum
    return if rand(2) == 0
  end

  loop do
    x = rand lb..ub
    res = nums n - 1, sum - x, lb, ub
    if res
      return res << x
    end
  end
end

哇,牛啊,学习学习,偶数学太差,得慢慢看。。。

这玩儿意感觉可以用遗传算法来算,会更平均些。。

#10 楼 @swordray ,能否解释一下你的思路?

#13 楼 @rasefon ,遗传算法?请上代码

#12 楼 @badboy ,同感,我觉得算法太难学了

我也来做做玩玩,先说思路: 使用一个足够大的数字作为引子,生成一个比较大数字组成的数组 big_ary; 这些数字之和不一定等于 80,记为 big_sum; 用 big_sum/sum 得到差距的比值; 如果结果可以是浮点数,用 big_ary 除以比值得到一个 small_ary 已经够了; 但楼主意思应该是要整数数组,因此多一步处理: 先将 small_ary 取整并求和,该和由于舍去小数部分,必然小于 80, 随机挑选 small_ary 中 (80-sum_small) 个数字进行 +1 操作从而补全 80; 这样最终得到的随机数应该是均匀分布的,而且效率在 O(n)

贴代码,主要重视可读性,未在性能上优化

BIG_NUM=1000000
def ran_array_with_sum count, sum
  ary_big = count.times.map{rand BIG_NUM}
  sum_big = ary_big.reduce :+
  proportion = sum_big/sum.to_f
  ary_small = ary_big.map{|i| i/proportion}
  #以下为补全小数
  ary_small_int = ary_small.map{|i| i.to_i}
  sum_small = ary_small_int.reduce :+
  sum_diff = sum - sum_small
  sum_diff.times{ary_small_int[rand count]+=1}
  ary_small_int
end

ran_array_with_sum 10,80

#15 楼 @bluexuemei 过会儿给你,最近比较忙。

$pm = 0.01
$cluster_num = 20
$result = nil

def gen_rands(n, lb, rb)
  rands = []
  n.times do
    r = rand(lb..rb)
    rands << r
  end
  rands
end

def gen_rands_arr(arr_num, n, lb, rb)
  rands_arr = []
  arr_num.times do 
    rands_arr << gen_rands(n, lb, rb)
  end
  rands_arr
end

def fit(sum, all_rands)
  total_p = 0.0
  curr_p_arr = []
  all_rands.each do |rands|
    curr_p = 0.0;
    rands.each { |r| curr_p += r }
    delta = sum-curr_p
    if delta == 0
      $result=rands
      return
    end
    curr_p = 1.0/(delta)
    curr_p = curr_p.abs
    curr_p_arr << curr_p
    total_p += curr_p
  end

  fit_arr = []
  fit_arr << curr_p_arr[0]/total_p
  curr_p_arr.each_index do |i|
    next if i == 0
    fit_arr << fit_arr[i-1] + curr_p_arr[i]/total_p
  end

  fit_arr
end

def ga(sum, n, lb, rb)
  rands_arr = gen_rands_arr($cluster_num, n, lb, rb)
  fit_arr = fit(sum, rands_arr)
  while $result.nil?
    fit_arr = fit(sum, rands_arr)
    return unless $result.nil?
    selector_arr = []
    $cluster_num.times do
      selector_arr << rand()
    end

    selected_rands_indexs = Array.new($cluster_num, 0)
    selector_arr.each do |s|
      # binary search
      low = 0
      high = fit_arr.size
      while low <= high
        middle = (low+high)/2
        if fit_arr[middle] >= s
          high = middle-1
        else
          low = middle+1
        end
      end
      selected_rands_indexs[low] += 1
    end

    next_gen_rands_arr = []
    k = 0
    while k < $cluster_num do
      i = 0
      while selected_rands_indexs[i] == 0
        i += 1
      end
      selected_rands_indexs[i] -= 1

      j = i+1
      while selected_rands_indexs[j] == 0
        j += 1
      end
      j = i if j == $cluster_num
      selected_rands_indexs[j] -= 1

      next_gen_rands1 = []
      next_gen_rands2 = []
      # 随机交叉点
      r1 = rand(0..n-1)
      r2 = rand(0..n-1)
      if r1 > r2
        tmp = r1
        r1 = r2
        r2 = tmp
      end 

      for ii in 0..(r1-1) do
        next_gen_rands1[ii] = rands_arr[i][ii]
        next_gen_rands2[ii] = rands_arr[j][ii]
      end
      for jj in (r2+1)..(n-1) do
        next_gen_rands1[jj] = rands_arr[i][jj]
        next_gen_rands2[jj] = rands_arr[j][jj]
      end
      # 交换基因
      for kk in r1..r2 do 
        next_gen_rands1[kk] = rands_arr[j][kk]
        next_gen_rands2[kk] = rands_arr[i][kk]
      end
      # 忽略修复重复基因

      #新种群
      next_gen_rands_arr[k] = next_gen_rands1
      next_gen_rands_arr[k+1] = next_gen_rands2
      k+=2
    end

    next_gen_rands_arr.each do |r|
      #突变
      pm = rand()
      if pm < $pm
        #发生突变
        #突变点
        pi = rand(0..n-1)
        new_rand = rand(lb..rb)
        r[pi] = new_rand
      end
    end
    rands_arr = next_gen_rands_arr

    rands_arr.each do |rands|
      curr_p=0
      rands.each { |r| curr_p += r }
      delta = sum-curr_p
      if delta == 0
        $result=rands
        return
      end
    end
  end
end

ga 80000, 100, 4, 1300
p $result.sort

#11 楼 @luikore 大数据范围你这类的基本开头都不随机的,几乎都是最大值。。。 例如 nums 100, 80000, 4, 1300

#11 楼 @luikore 例如我跑了 5 遍排了下序,结果是这样的: ➜ ruby ruby tt.rb [20, 32, 47, 49, 119, 153, 160, 200, 201, 221, 222, 260, 271, 272, 297, 303, 304, 314, 333, 350, 352, 358, 366, 399, 400, 400, 402, 467, 492, 501, 511, 514, 532, 553, 573, 588, 607, 646, 649, 655, 670, 691, 720, 725, 750, 757, 762, 770, 792, 800, 822, 828, 841, 848, 850, 874, 904, 942, 988, 996, 1003, 1050, 1060, 1092, 1104, 1115, 1159, 1206, 1207, 1208, 1211, 1223, 1224, 1230, 1234, 1246, 1256, 1265, 1266, 1270, 1278, 1284, 1295, 1297, 1298, 1298, 1299, 1299, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300] ➜ ruby ruby tt.rb [34, 39, 40, 77, 79, 82, 85, 94, 96, 99, 129, 136, 145, 183, 241, 283, 295, 341, 342, 346, 382, 385, 402, 474, 484, 503, 538, 551, 554, 561, 572, 630, 646, 673, 676, 676, 684, 689, 699, 714, 718, 748, 750, 750, 766, 786, 809, 823, 847, 862, 871, 880, 881, 883, 892, 917, 932, 932, 974, 980, 989, 993, 1003, 1019, 1035, 1041, 1055, 1067, 1076, 1077, 1120, 1156, 1173, 1191, 1194, 1205, 1242, 1243, 1248, 1258, 1264, 1293, 1299, 1299, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300] ➜ ruby ruby tt.rb [27, 50, 91, 105, 106, 108, 128, 138, 141, 147, 164, 187, 235, 298, 335, 358, 378, 381, 384, 413, 414, 444, 457, 458, 469, 480, 497, 499, 506, 536, 545, 564, 565, 615, 617, 625, 628, 658, 661, 679, 681, 687, 701, 701, 707, 710, 714, 722, 740, 796, 797, 855, 860, 903, 904, 953, 965, 995, 1005, 1014, 1015, 1036, 1042, 1063, 1068, 1081, 1101, 1105, 1105, 1128, 1162, 1166, 1166, 1167, 1176, 1178, 1193, 1198, 1223, 1256, 1258, 1258, 1271, 1294, 1295, 1296, 1299, 1299, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300] ➜ ruby ruby tt.rb [18, 28, 35, 40, 49, 60, 125, 139, 177, 189, 190, 201, 245, 264, 268, 291, 316, 320, 338, 351, 351, 359, 392, 398, 401, 412, 425, 441, 468, 477, 492, 497, 512, 516, 525, 568, 573, 604, 643, 655, 662, 673, 715, 740, 772, 804, 829, 869, 872, 877, 881, 901, 909, 914, 931, 948, 973, 992, 1001, 1020, 1028, 1057, 1058, 1076, 1115, 1122, 1124, 1124, 1125, 1128, 1145, 1150, 1173, 1243, 1245, 1266, 1285, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300] ➜ ruby ruby tt.rb [9, 10, 37, 42, 85, 137, 139, 171, 183, 184, 199, 210, 235, 236, 271, 278, 298, 307, 329, 339, 345, 360, 366, 393, 401, 404, 435, 477, 527, 575, 575, 585, 591, 609, 629, 638, 646, 647, 648, 671, 725, 726, 758, 768, 769, 777, 781, 788, 814, 819, 822, 838, 865, 872, 893, 896, 897, 904, 916, 936, 946, 1048, 1049, 1050, 1060, 1105, 1137, 1138, 1138, 1156, 1194, 1205, 1208, 1212, 1228, 1241, 1268, 1275, 1276, 1288, 1297, 1298, 1299, 1299, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300, 1300]

如果用遗传算法,效果会好很多,不过同时也慢(存在不确定性):

➜ ruby ruby ga.rb [4, 32, 100, 162, 171, 220, 232, 249, 259, 286, 292, 318, 322, 356, 362, 362, 381, 436, 442, 448, 454, 463, 496, 501, 510, 516, 573, 582, 589, 591, 593, 597, 597, 619, 654, 672, 681, 697, 707, 707, 712, 714, 731, 736, 759, 815, 846, 850, 868, 870, 872, 877, 897, 900, 919, 921, 930, 949, 951, 957, 964, 980, 982, 995, 1002, 1039, 1043, 1044, 1045, 1046, 1069, 1071, 1080, 1082, 1094, 1094, 1109, 1110, 1139, 1141, 1145, 1168, 1173, 1173, 1192, 1201, 1203, 1220, 1221, 1224, 1232, 1240, 1243, 1251, 1258, 1275, 1278, 1282, 1288, 1297] ➜ ruby ruby ga.rb [8, 59, 97, 132, 146, 168, 179, 180, 223, 230, 262, 263, 276, 291, 318, 318, 328, 350, 350, 461, 473, 477, 494, 494, 495, 540, 561, 575, 587, 590, 641, 650, 661, 672, 677, 699, 705, 718, 738, 739, 751, 752, 761, 769, 788, 807, 841, 864, 868, 882, 894, 897, 903, 924, 930, 956, 957, 969, 978, 980, 988, 998, 998, 1002, 1008, 1011, 1040, 1041, 1047, 1056, 1078, 1086, 1094, 1113, 1121, 1133, 1136, 1140, 1153, 1156, 1167, 1170, 1173, 1177, 1179, 1179, 1186, 1189, 1204, 1215, 1217, 1221, 1223, 1225, 1239, 1240, 1248, 1266, 1288, 1299] ➜ ruby ruby ga.rb [22, 122, 135, 167, 184, 227, 232, 234, 237, 265, 278, 290, 330, 352, 379, 388, 393, 401, 405, 424, 439, 464, 503, 505, 508, 517, 527, 553, 555, 557, 561, 588, 592, 601, 608, 624, 652, 659, 676, 684, 717, 719, 726, 738, 747, 759, 768, 786, 792, 894, 904, 917, 926, 929, 932, 934, 949, 952, 954, 961, 979, 982, 989, 999, 1007, 1013, 1024, 1027, 1041, 1052, 1081, 1096, 1097, 1114, 1118, 1120, 1122, 1126, 1166, 1168, 1169, 1174, 1189, 1214, 1215, 1224, 1225, 1226, 1238, 1247, 1249, 1257, 1259, 1268, 1270, 1271, 1275, 1276, 1292, 1299] ➜ ruby ruby ga.rb [146, 157, 159, 160, 184, 192, 234, 247, 272, 295, 361, 384, 391, 399, 418, 435, 435, 472, 494, 524, 553, 556, 562, 565, 575, 593, 602, 611, 630, 631, 640, 642, 661, 663, 669, 678, 695, 699, 713, 717, 721, 729, 753, 753, 766, 778, 782, 802, 814, 818, 831, 833, 836, 851, 867, 879, 881, 887, 894, 905, 914, 952, 959, 965, 969, 973, 995, 1002, 1005, 1019, 1022, 1025, 1031, 1041, 1045, 1064, 1085, 1087, 1104, 1128, 1140, 1155, 1159, 1163, 1165, 1165, 1190, 1191, 1193, 1217, 1222, 1224, 1228, 1228, 1238, 1247, 1257, 1262, 1288, 1289] ➜ ruby ruby ga.rb [16, 26, 38, 64, 155, 170, 203, 227, 231, 233, 241, 270, 270, 283, 294, 309, 350, 374, 396, 398, 404, 447, 451, 454, 490, 509, 516, 523, 544, 573, 609, 610, 620, 683, 685, 691, 719, 722, 733, 783, 786, 787, 791, 796, 799, 812, 825, 841, 846, 850, 877, 910, 920, 933, 933, 946, 947, 958, 959, 984, 990, 1018, 1018, 1018, 1027, 1030, 1050, 1059, 1064, 1078, 1082, 1106, 1111, 1111, 1131, 1134, 1136, 1140, 1158, 1158, 1159, 1160, 1175, 1180, 1183, 1184, 1186, 1190, 1212, 1231, 1233, 1255, 1257, 1262, 1275, 1277, 1280, 1285, 1291, 1292]

如果是求 >=4 的 10 个随机数,就比较简单:

def samples_for_sum(sum, k) # generate k rand values greater than 1, whose sum is 'sum'
  seq = [0] + (1...sum).to_a.sample(k-1).sort + [sum]
  seq.each_cons(2).map{|pre, cur| cur-pre}
end

def sum_samples_at_least(min, sum, k)
  cut = min-1
  samples_for_sum(sum-k*cut, k)
    .map{|i| i+cut}
end

all = sum_samples_at_least(4,80,10)
p all, all.inject(:+)

#14 楼 @bluexuemei 判断一下剩下的数的取值范围不会导致和超过 80,这样的话就可以一次性取成功,复杂度为 O(1)。至于不平均的问题倾斜一下概率就好了,不会增加复杂度。

rasefon 好玩的遗传算法。 提及了此话题。 08月27日 20:15
需要 登录 后方可回复, 如果你还没有账号请 注册新账号