Python list vs. NumPy array: an efficiency comparison

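Preface

Training runs often involve a very large number of iterations, so choosing the more efficient operations can cut running time considerably. I could not find much material on this, so I am recording my own findings here and will add to them as needed.

Indexing efficiency and memory usage

Sometimes I need an array-like container that will be indexed frequently. Should I pick a list or a NumPy array? Here is a simple experiment to compare them, run on Python 3.6.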

import random
import numpy as np
import time
import sys
# import matplotlib
# matplotlib.use('agg')
import matplotlib.pyplot as plt
from collections import deque

start = time.time()
length = []

list_size = []
array_size = []
deque_size = []

list_time = []
array_time = []
deque_time = []

for l in range(5, 15000, 5):
    print(l)
    length.append(l)
    a = [1] * l
    b = np.array(a)
    c = deque(maxlen=l)
    for i in range(l):
        c.append(1)

    # print('list size: {}'.format(sys.getsizeof(a)))
    # print('array size: {}'.format(sys.getsizeof(b)))
    # print('deque size: {}'.format(sys.getsizeof(c)))
    list_size.append(sys.getsizeof(a))
    array_size.append(sys.getsizeof(b))
    deque_size.append(sys.getsizeof(c))

    for i in range(3):
        if i == 0:
            tmp = a
            name = 'list'
        elif i == 1:
            tmp = b
            name = 'array'
        else:
            tmp = c
            name = 'deque'

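        # Time one million random lookups. The measured time also includes
        # the cost of random.randint() and len() on every iteration.
        s = time.time()
        for j in range(1000000):
            x = tmp[random.randint(0, len(a)-1)]
        duration = time.time() - s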

        if name == 'list':
            list_time.append(duration)
        elif name == 'array':
            array_time.append(duration)
        else:
            deque_time.append(duration)

duration = time.time() - start
time_list = [0, 0, 0]
time_list[0] = duration // 3600
time_list[1] = (duration % 3600) // 60
time_list[2] = round(duration % 60, 2)
print('Elapsed: ' + str(time_list[0]) + ' h ' + str(time_list[1]) + ' min ' + str(time_list[2]) + ' s')

fig = plt.figure()

ax1 = fig.add_subplot(211)
ax1.plot(length, list_size, label='list')
ax1.plot(length, array_size, label='array')
ax1.plot(length, deque_size, label='deque')
plt.xlabel('length')
plt.ylabel('size')
plt.legend()

ax2 = fig.add_subplot(212)
ax2.plot(length, list_time, label='list')
ax2.plot(length, array_time, label='array')
ax2.plot(length, deque_time, label='deque')
plt.xlabel('length')
plt.ylabel('time')
plt.legend()

plt.show()

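One million index operations were performed on a list, a NumPy array, and a deque at each size. The results are as follows.

As the results show, the NumPy array is very memory-efficient: the longer the container, the less memory it uses relative to the list and the deque.

The list does slightly better than the deque. So if memory usage matters, the order of preference is: NumPy array >> list > deque.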
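How large the memory gap looks depends on what is being counted. sys.getsizeof on a list covers the list header and one object reference per slot, not the objects those references point to, while a NumPy array stores the raw values in a buffer of n * itemsize bytes. The sketch below makes these quantities visible; the variable names are illustrative and are not part of the original experiment, and the default integer dtype it prints can differ between platforms and NumPy versions.

```python
import struct
import sys

import numpy as np

n = 10_000

py_list = [1] * n
arr32 = np.ones(n, dtype=np.int32)
arr64 = np.ones(n, dtype=np.int64)

# Bytes used by one object reference on this platform (8 on 64-bit builds).
pointer_size = struct.calcsize("P")

# Size of the list object itself: header plus one reference per slot.
# The int objects that the slots refer to are NOT included.
list_bytes = sys.getsizeof(py_list)

# ndarray.nbytes is the size of the raw element buffer: n * itemsize.
print("list :", list_bytes, "bytes (references only)")
print("int32:", arr32.nbytes, "bytes,", arr32.itemsize, "per element")
print("int64:", arr64.nbytes, "bytes,", arr64.itemsize, "per element")
print("default integer dtype:", np.array([1]).dtype)
```

Passing an explicit dtype when creating the array is one way to control the per-element cost instead of relying on the platform default.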

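In terms of time, the list is almost always the fastest for lengths below 15000. More specifically:

For lengths below roughly 1000, the deque is about as fast as the list; order of preference: list ≈ deque > NumPy array;
For lengths below roughly 9000, order of preference: list > deque > NumPy array;
For lengths above roughly 9000, order of preference: list > NumPy array > deque;

However, the timing differences are small and almost negligible; the main difference is in memory usage. So if memory is not a concern, the list is the best choice.

The whole experiment ran on an i7-9700 and took 2 h 36 min 20.27 s. If anyone is willing to try larger sizes or a finer step, please let me know the results.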
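For anyone who wants to repeat the index timing at other sizes, here is a minimal sketch using the standard timeit module. It is not the original script: the function name time_indexing and its parameters are hypothetical, and the random positions are drawn once up front so that the timed loop contains only the index operation. Two known factors are worth keeping in mind when reading the numbers: indexing a deque is O(1) at the ends but O(n) in the middle, and reading a single element from a NumPy array creates a NumPy scalar object on every access.

```python
import random
import timeit
from collections import deque

import numpy as np


def time_indexing(length, lookups=100_000, repeats=3):
    """Return the best-of-`repeats` time in seconds for `lookups`
    random reads from a list, a NumPy array and a deque of `length` items."""
    data = [1] * length
    containers = {
        "list": data,
        "array": np.array(data),
        "deque": deque(data, maxlen=length),
    }
    # Draw the random positions once, outside the timed region.
    positions = [random.randrange(length) for _ in range(lookups)]

    def read_all(container):
        for p in positions:
            container[p]

    results = {}
    for name, container in containers.items():
        timings = timeit.repeat(lambda c=container: read_all(c),
                                number=1, repeat=repeats)
        results[name] = min(timings)
    return results


if __name__ == "__main__":
    for length in (100, 1_000, 10_000):
        print(length, time_indexing(length))
```

Taking the minimum over several repeats reduces the influence of background load on the machine; the absolute numbers will still vary with hardware and Python version.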

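Append efficiency

A NumPy array cannot change its size dynamically, so in this test the NumPy array is preallocated and only assigned to, element by element.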

import numpy as np
import time
from collections import deque

l = 10000000
a = []
b = np.zeros(l)  # preallocated, because a NumPy array cannot grow in place
c = deque(maxlen=l)
for i in range(3):
    if i == 0:
        tmp = a
        name = 'list'
    elif i == 1:
        tmp = b
        name = 'array'
    else:
        tmp = c
        name = 'deque'

    start = time.time()
    if name == 'array':
        # element-by-element assignment into the preallocated array
        for j in range(l):
            tmp[j] = 1
    else:
        # list and deque grow through append()
        for j in range(l):
            tmp.append(1)
    duration = time.time() - start
    # split the elapsed seconds into hours, minutes and seconds
    time_list = [0, 0, 0]
    time_list[0] = duration // 3600
    time_list[1] = (duration % 3600) // 60
    time_list[2] = round(duration % 60, 2)
    print(name + ' elapsed: ' + str(time_list[0]) + ' h ' + str(time_list[1]) + ' min ' + str(time_list[2]) + ' s')

Result:

list elapsed: 0.0 h 0.0 min 1.0 s
array elapsed: 0.0 h 0.0 min 1.14 s
deque elapsed: 0.0 h 0.0 min 0.99 s

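As the results show, a difference only appears at very large sizes: element assignment into the NumPy array is the slowest, while the list and the deque are about the same.

In everyday use, though, these differences are almost negligible.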
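The assignment test writes one element per Python loop iteration, so every write pays interpreter overhead. When all values are known up front, NumPy can do the same fill in a single slice assignment. The sketch below compares the two under that assumption, which does not hold for every training loop; the variable names are illustrative, and no timings are quoted because they depend on the machine.

```python
import time

import numpy as np

n = 1_000_000

# Element-by-element assignment in a Python loop, as in the test above.
looped = np.zeros(n)
start = time.perf_counter()
for j in range(n):
    looped[j] = 1
loop_seconds = time.perf_counter() - start

# The same fill as one slice assignment executed inside NumPy.
filled = np.zeros(n)
start = time.perf_counter()
filled[:] = 1
slice_seconds = time.perf_counter() - start

print("loop :", round(loop_seconds, 4), "s")
print("slice:", round(slice_seconds, 4), "s")
print("same contents:", np.array_equal(looped, filled))
```

If the values arrive one at a time and the final length is unknown, collecting them in a list and converting once with np.array at the end is a common alternative.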

Summary

The above is my personal experience; I hope it serves as a useful reference.
